1.1 From code breaking to cryptanalysis

Initially, cryptography consisted solely of encrypting messages. The usual encryption methods
included substituting each letter of the alphabet, or transposing the order of the letters. In the era when
all computations relied on manual labour, such encryption was enough to resist attacks by an
intruder. However, as there was no notion of security and no thorough analysis of encryption methods,
no one knew exactly how difficult it was to break a seemingly well-concealed message.

1.1.1 Examples of some classical systems 

Games between cryptographers and code-breakers are a common plot in stories. Among the most
classical and naive systems are the substitution cipher and the transposition cipher. For example, in Arthur
Conan Doyle's story The Adventure of the Dancing Men, Sherlock Holmes encountered an encrypted
message. The encrypted messages between the criminal and Elsie were formed by sequences
of dancing men. Holmes was able to find the concealed content by comparing the frequency of
symbols with that of natural English texts.
Figure 1.1: The dancing men stand for different English letters in these secret messages.
Panels: criminal's message (1), criminal's message (2), Elsie's reply, criminal's message (3).


In another famous novel, "Journey to the Center of the Earth" by Jules Verne, the entrance of
the passage to the center of the earth was encrypted using a transposition cipher. It took Professor
Lidenbrock and his nephew Axel two days to decrypt.

In those days, cracking a ciphertext depended much on one's luck and knowledge of languages,
apart from mathematics. People generally could not predict what was essential to break an encryption
scheme. No wonder that when the United Kingdom wanted to recruit talented people to decrypt telegrams
from the German army during World War II, crossword experts and chess champions were also
invited to Bletchley Park to take part in the cryptanalysis effort.

1.1.2 The role of cryptanalysis 

The role of cryptanalysis is to provide a reliable analysis of the security of an encryption method,
usually by making use of sophisticated mathematical tools, and thereby to provide confidence in the security
of the system, to provide arguments for setting parameters, or to propose possible ways of improving the
security.

For instance, the previous story illustrates several typical characteristics of substitution ciphers.
In the story, Sherlock Holmes guessed correctly that each dancing man represented an English
letter, and that the whole message was an English sentence with flags separating words. By analysing
the frequency of symbols, it is not hard to notice that one symbol comes up much more often than
any other, and therefore it most probably stands for the letter 'e', which is by far the most
frequent letter in common English sentences. But since there were too few symbols in the beginning,
Holmes was not able to recover the hidden message until he saw the third message, which consisted of
only five letters. With the two E's coming second and fourth in a word of five letters that serves
as a reply to an appeal, the most probable match is 'never'. This discloses the letters 'N', 'V',
and 'R'. Consequently, Holmes was able to decode the whole message by guessing from the finitely
many possible words, and thereby decode the malicious messages sent by the criminal. Finally,
he wrote an invitation message in the dancing men code, which successfully lured the criminal
into a trap and led to his arrest.

However, breaking a single ciphertext is not the end point of cryptanalysis. It is more
important to explore the essential features of substitution ciphers in general.
1. Frequency analysis is an effective means of attacking a substitution cipher. In a normal English text, the
frequency of certain letters is considerably higher than that of others. When these letters are replaced by
symbols, their frequencies remain unchanged. Consequently, we can compare these frequencies
to those of normal English texts, which reveals the correspondence between the symbols and
the letters they stand for. This is not specific to English: in every natural language, each symbol of the alphabet
has a characteristic frequency, which can serve to analyze a substitution cipher.


Figure 1.2: Frequency of letters in English texts [52] 



2. A substitution cipher cannot change the length of a message, nor can it conceal special
patterns in the text, and in some cases this leaks important clues for decryption. A recognizable
pattern or a particularly short message usually facilitates attacks.
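
The two observations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample sentence, the random key and the helper names (`make_substitution_key`, `letter_frequencies`) are made up for the example and do not come from the dancing men story. The sketch encrypts a short English text with a simple substitution and checks that the most frequent ciphertext symbol is the image of 'e', and that the message length is unchanged.

```python
from collections import Counter
import random
import string

def make_substitution_key(seed: int) -> dict:
    """Build a random one-to-one letter substitution (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_lowercase, shuffled))

def encrypt(text: str, key: dict) -> str:
    """Substitute each letter; spaces play the role of the flags and are kept."""
    return "".join(key.get(ch, ch) for ch in text.lower())

def letter_frequencies(text: str) -> list:
    """Return (symbol, relative frequency) pairs, most frequent first."""
    letters = [ch for ch in text if ch in string.ascii_lowercase]
    counts = Counter(letters)
    return [(ch, n / len(letters)) for ch, n in counts.most_common()]

# Assumed sample text, chosen so that 'e' clearly dominates.
plaintext = ("never tell me the odds we were here when the events "
             "were reported even the experts agree these sentences help")
key = make_substitution_key(seed=1)
ciphertext = encrypt(plaintext, key)

# The attacker only sees the ciphertext and guesses: top symbol stands for 'e'.
top_symbol, top_freq = letter_frequencies(ciphertext)[0]
guess_is_correct = (top_symbol == key["e"])
```

On such a short text only the top symbol is reliable; recovering the remaining letters needs either more ciphertext or guesses about likely words, as in the story.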


From the analysis of the substitution cipher, it can be concluded that it is not secure to apply
substitutions directly to a plaintext when each letter of the alphabet has an apparent characteristic frequency
and the total size of the alphabet is small. The unicity distance describes the minimum number of letters
required to break a substitution cipher. For simple substitution, the unicity distance is 27.6 letters. The
effectiveness of frequency analysis increases as the length of the encrypted message grows. In practice,
the number of letters required to break a substitution cipher is usually 2 to 4 times the theoretical lower
bound, which means 50 to 100 letters for a substitution ciphertext. This also points out how
we should improve the cipher.
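
The value 27.6 can be reproduced with a short computation. The sketch below assumes the usual definition of the unicity distance as key entropy divided by the redundancy of the language, and assumes that English carries about 1.5 bits of information per letter; neither assumption is stated in the text above.

```python
import math

# Key entropy of a simple substitution: log2 of the number of keys (26!).
key_entropy_bits = math.log2(math.factorial(26))

# Assumed redundancy of English: log2(26) bits per letter minus roughly
# 1.5 bits of actual information per letter (a commonly quoted estimate).
redundancy_bits_per_letter = math.log2(26) - 1.5

unicity_distance = key_entropy_bits / redundancy_bits_per_letter
print(f"key entropy = {key_entropy_bits:.1f} bits, "
      f"unicity distance = {unicity_distance:.1f} letters")
```

A different estimate of the redundancy of English would shift the result by a few letters, so the figure should be read as an order of magnitude.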

In order to resist frequency analysis to a reasonable degree, one possible extension of the substitution
cipher is to use a more complicated substitution strategy. For example, instead of substituting each single
letter, it is possible to substitute each pair of consecutive letters (digraphs) in the message. Examples
of digraphic ciphers include Playfair ciphers and four-square ciphers. These ciphers have stronger
resistance against frequency attacks. Another way is to change the substitution rule constantly, so that
there is no longer a fixed mapping between the symbols in the ciphertext and the plaintext. This was
the principal idea underlying rotor machines, which are electro-mechanical devices. The
substitution rule depends on the combination of rotors, whose positions change with typing. A
famous representative of rotor machines is the Enigma machine used by the German army during World
War II, which required tremendous effort to break.
1.2 Modern Cryptology

1.2.1 Basic concepts of modern cryptology 

Cryptology became a subject of public study long after World War II, and it soon became a domain closely related
to applied mathematics. On the one hand, the notion of security gradually developed into an increasingly
rigorous form. On the other hand, the content of cryptography was enriched to serve various purposes:
cryptography is no longer restricted to encryption alone.

The following are some of the most important notions underlying modern cryptography:

Public design It has become widely recognized that modern cryptographic design should comply
with Kerckhoffs's principle: a cryptosystem should remain secure even if everything about the
system, except the key, is public knowledge. This has two advantages. First, we only
have to keep secret a small amount of information, namely the key, and if it falls into the hands of
adversaries, we only need to change the key, not the entire cryptosystem. In addition,
this allows public analysis of the cryptosystem, so that we have a better idea of its security level
and its potential weaknesses.

Security level The popularity of computers and the internet has greatly shaped the evolution of cryptography.
On the one hand, the design is often based on the communication requirements of the digital world.
On the other hand, the security requirement is naturally based on computational power, which
is growing rapidly. For any cryptosystem, we need to clarify its security level before it can be
applied in practice. The security level is defined with regard to the best known attack on the system. If
the best attack against a cryptosystem requires $2^n$ operations, then it is said to have n-bit security.
The type of operation should not be too complex; in particular, its cost should not be too far from a CPU
clock cycle.

Today, the clock rate of a CPU is about $2^{31}$ Hz, and a single core can perform approximately $2^{26}$
multiplications of 128-bit integers per second. About 200 million PCs are sold
per year. Since supercomputers with about $2^{21}$ cores already exist, it is
feasible to carry out more than $2^{60}$ multiplications of 128-bit integers in a single day. As Moore's law
predicts an exponential increase in computational power, this number is expected to keep growing
steadily. Therefore a secure cryptosystem should have at least 80 to 128 bits of security.

Primitives and protocols Cryptographic primitives are well-studied fundamental algorithms which
are considered reliable. They serve as building blocks for more complex security applications.
A cryptographic protocol describes how these cryptographic primitives should be used in the
application, and how the different parties should use the information and interact with each other.
A cryptographic primitive needs to undergo strict analysis before it is recognized as secure.
Primitives usually accomplish a single basic task, such as encryption or signature. A protocol, by contrast,
makes comprehensive use of different cryptographic primitives and tools, defining new rules
of communication. Such protocols aim to ensure both security and efficiency; for example, electronic
voting and auction protocols are becoming more prevalent.
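
The order of magnitude quoted in the security level paragraph above can be checked directly. The sketch below only multiplies the figures given there ($2^{26}$ multiplications per core per second, $2^{21}$ cores, one day); the extrapolation to an 80-bit attack assumes, for illustration, that one operation of the attack costs about one such multiplication.

```python
import math

mults_per_core_per_second = 2 ** 26   # figure quoted in the text
cores = 2 ** 21                       # figure quoted in the text
seconds_per_day = 24 * 60 * 60

ops_per_day = mults_per_core_per_second * cores * seconds_per_day
log2_ops_per_day = math.log2(ops_per_day)

# Assumption: one attack operation costs about one 128-bit multiplication.
days_for_80_bits = 2 ** 80 / ops_per_day
years_for_80_bits = days_for_80_bits / 365

print(f"about 2^{log2_ops_per_day:.1f} multiplications per day")
print(f"about {years_for_80_bits:.0f} years for 2^80 operations")
```

Under these assumptions the daily budget is a little above $2^{63}$, consistent with the claim of more than $2^{60}$, while an 80-bit attack remains out of reach of a single such machine for centuries.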

1.2.2 Cryptography in different security aspects 

Modern cryptology is not restricted to encryption. It also provides many other functions, such as
verification of identity and authentication of messages. In general, these functions aim to guarantee the confidentiality,
integrity and authenticity of data. Cryptosystems try to achieve one or several of the following goals:







Confidentiality Only the holder of the secret key can decrypt the message. Unauthorized
parties should not be able to obtain any knowledge of the message, not even one bit of information,
except for the message length.

Authenticity There should be a way to guarantee that the message was indeed sent by the claimed sender.
No one else should be able to forge an identity.

Integrity If the content of a message is modified by anyone other than the sender, then the receiver
should be able to detect this fact.

Non-repudiation The sender of a message cannot deny having sent it.

There are usually two different parties involved: the first party performs an operation to protect
the information and transform it into a disguised form, be it encryption, signature or another operation; the
second party, who receives the disguised information, uses another operation to interpret it. Depending
on whether the two parties share the key, cryptosystems are divided into the following two categories:
symmetric cryptography and asymmetric cryptography.

1.2.3 Symmetric cryptography 

In symmetric cryptography, the secret key is shared between the two parties in the conversation. The parties
must agree upon this secret key before the conversation takes place over a public channel. It is called
symmetric because all participants in the conversation play symmetric roles and share the same secret.
It is also called secret-key cryptography because its key should be known only to trusted parties.

Usually, symmetric cryptosystems enjoy the advantage of being efficient and inexpensive in both
hardware and software implementations. In particular, for encryption, the ciphertext is not
significantly longer than the original text. On the other hand, they have a drawback: the parties must always
share the secret key before any conversation can take place. Thus, if there has been no prior communication
between the parties, they cannot start their conversation directly with a symmetric cryptosystem.

The security of symmetric cryptography is usually upper bounded by the size of its key, due to
the brute-force attack: we can always enumerate all possible keys and check whether the decryption
makes sense. This is the reason why the single-DES system, whose key size is fixed at 56 bits, is no longer
considered secure. Several specialized machines have managed to break DES in
practice, and some of them can do so in less than one day. The successor of DES, the AES system,
supports key sizes of 128, 192 and 256 bits.
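
The brute-force attack can be sketched on a toy cipher. Everything below is an assumption made for illustration: the cipher is a hypothetical XOR stream cipher with a 16-bit key built from SHA-256, not DES or AES, and "decryption makes sense" is modelled as "the result is printable ASCII".

```python
import hashlib

KEY_BITS = 16  # deliberately tiny so that the search finishes quickly

def keystream(key: int, length: int) -> bytes:
    """Derive a keystream from the key with SHA-256 (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = key.to_bytes(2, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: int, message: bytes) -> bytes:
    """XOR the message with the keystream; the same function decrypts."""
    return bytes(m ^ k for m, k in zip(message, keystream(key, len(message))))

toy_decrypt = toy_encrypt

def looks_like_text(data: bytes) -> bool:
    """Crude test for 'the decryption makes sense': printable ASCII only."""
    return all(32 <= b < 127 for b in data)

def brute_force(ciphertext: bytes) -> list:
    """Try all 2^KEY_BITS keys and keep those giving plausible plaintext."""
    return [k for k in range(2 ** KEY_BITS)
            if looks_like_text(toy_decrypt(k, ciphertext))]

secret_key = 0xBEEF
message = b"attack at dawn, bring the whole fleet"
ciphertext = toy_encrypt(secret_key, message)
candidates = brute_force(ciphertext)
```

The search costs $2^{16}$ trial decryptions here; each additional key bit doubles that cost, which is why the key size bounds the security level from above.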

1.2.4 Asymmetric cryptography 

This kind of cryptosystem requires a pair of keys, one of which is public and the other secret. While
anyone can verify a signature or encrypt a message with the public key, only the holder of the secret key
can decrypt a ciphertext or sign a message. The secret key and the public key are mathematically
related, but it should be computationally infeasible to obtain the secret key from the knowledge of the
public key. These systems are also called public-key cryptography.

In contrast with symmetric cryptography, where the secret key is shared between all participants in a
conversation, in asymmetric cryptography the secret key is not known to all parties in a communication.
The holder of the secret key has a central position in the interaction. For example, in the case of
public-key encryption, everybody can use the public key to encrypt messages, but only the owner of the secret
key can decrypt them. The roles of the participants in the communication are asymmetric. Apart from
encryption, asymmetric cryptography also has applications in signatures, e-voting, etc.
For this type of communication, the parties are not required to share any
secret in advance. The price is that encrypted messages in public-key systems are usually much
larger than the plaintext, and the computational cost is higher. For practical reasons, asymmetric systems
are often used together with symmetric systems for long conversations. As a first step, the
parties agree on a secret using an asymmetric cryptosystem. This secret is then used as the secret key of
a symmetric cryptosystem for the rest of the conversation.
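
The two-step pattern described above can be sketched as follows. The sketch assumes a Diffie-Hellman style key agreement for the asymmetric step, with toy parameters (the prime $2^{127} - 1$ and base 3) that are far too small and unvetted for real use, and a hypothetical XOR stream cipher for the symmetric step. None of these choices come from the text.

```python
import hashlib
import secrets

# Toy public parameters (assumed for illustration only).
P = 2 ** 127 - 1
G = 3

def keypair():
    """Return (secret exponent, public value G^secret mod P)."""
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)

def session_key(own_secret: int, peer_public: int) -> bytes:
    """Step 1: derive a shared symmetric key from the asymmetric exchange."""
    shared = pow(peer_public, own_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Step 2: toy symmetric cipher, XOR with a SHA-256 counter keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

# Only the public values travel over the channel.
alice_secret, alice_public = keypair()
bob_secret, bob_public = keypair()
alice_key = session_key(alice_secret, bob_public)
bob_key = session_key(bob_secret, alice_public)

message = b"a long conversation follows ..."
ciphertext = xor_stream(alice_key, message)
recovered = xor_stream(bob_key, ciphertext)
```

The expensive modular exponentiations happen once per conversation, while every message is processed by the cheap symmetric operation, and the ciphertext has the same length as the message.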

In public-key cryptology, only one party owns the secret key, and this party never needs to reveal the
secret key to other parties during any cryptographic operation. Therefore, holding a
secret key is often regarded as a proof of identity, especially in signature schemes. In symmetric
cryptography, the life cycle of a key is usually limited to the duration of the conversation, because
sharing the secret key between different parties increases the risk of its disclosure. The
public-secret key pair in asymmetric cryptography, however, is kept much longer. This is necessary because a signature
should remain verifiable with the same public key for a reasonable duration.

1.3 Public-key cryptography 

Public-key cryptography depends on mathematical problems which have no known efficient solution. With
these problems, one creates a one-way function $f(\cdot)$ which can be obtained from the public key, so that
everyone can calculate the mapping $x \mapsto f(x)$. However, it is not feasible to compute the inverse of the
function unless we have some additional hint. This additional hint is the secret key.

The first public-key system was proposed by Diffie and Hellman in 1976. Their system uses the
property that calculating discrete logarithms in certain groups is difficult. Soon after, the RSA scheme
was proposed; its security is related to the hardness of integer factorization. Not all difficult
mathematical problems can yield public-key cryptosystems. Up to now, the most successful cryptosystems fall
into the following categories of problems: the discrete logarithm problem over finite fields or
elliptic curves, the RSA problem, lattice-based hard problems, coding-theory problems and multivariate
problems.

1.3.1 Security analysis 

Unlike in symmetric cryptography, the hardness of breaking the system does not depend only on the bit-size of the
secret key. More importantly, it relies on the difficulty of the mathematical problem. In most
cases, we can attack a cryptosystem by solving one instance of the underlying mathematical problem.
That is to say, breaking the system is at most as hard as solving the underlying mathematical problem.
We would hope to prove the converse, namely that the mathematical problem can be reduced to the task of
attacking the system. But this direction does not always hold. For example, it is conjectured that breaking the RSA
scheme is as difficult as factoring large integers, and there are already some partial proofs [1,16] which
seem to suggest an affirmative answer to this conjecture.

For integer factorization and discrete logarithms, we do not yet know their exact complexity classes.
Moreover, even if a problem is NP-hard, this does not mean that every instance of the problem is difficult
to solve. Therefore, estimating the precise security of a cryptosystem is complicated. There are
two ways to estimate security:

Security proof by reduction If any algorithm breaking the cryptosystem can be efficiently
transformed into an algorithm solving an instance of a mathematical problem which is conjectured to be hard, then
breaking this cryptosystem is considered to be as hard as solving the underlying mathematical
problem.
Security by cryptanalysis For many other cryptosystems, providing a direct security proof is difficult.
However, all cryptosystems are subject to cryptanalysis. If the research community is unable
to find an effective attack after many years of study, this brings confidence in the security.

With respect to the security notion, we should always be clear about the assumptions. Attacks are
classified according to the amount of information to which the attacker has access:

Ciphertext-only attack The adversary has access to the ciphertext and tries to recover the plaintext
from it. This is the most general assumption, since encrypted messages are
transmitted over a public channel, and the ciphertext can easily fall into the hands of any potential eavesdropper.

Known-plaintext attack The adversary possesses some plaintext-ciphertext pairs $\{(p_1, c_1), (p_2, c_2),
\dots, (p_i, c_i)\}$. With this information, he tries to compute the key, or to obtain information about
the plaintext of a new ciphertext $c_{i+1}$.

This can happen when the encrypted message has some fixed format, or when previously
classified information becomes public in later years. Still, the attacker cannot deliberately
choose any plaintext or ciphertext that he wants to know.

Chosen-plaintext attack The adversary can choose some plaintexts $\{p_1, \dots, p_i\}$ and ask the
cryptosystem to send back the corresponding ciphertexts $\{c_1, \dots, c_i\}$. Using this information, the
adversary tries to launch attacks.

Chosen-ciphertext attack The adversary can choose some ciphertexts $\{c_1, \dots, c_i\}$ and ask the
cryptosystem to send back the corresponding plaintexts $\{p_1, \dots, p_i\}$, and then tries to launch attacks.
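
As a minimal illustration of a known-plaintext attack, consider a toy repeating-key XOR cipher. The cipher, the messages and the assumption that the adversary knows the key length are all hypothetical and chosen only to keep the example short; real ciphers are designed to resist exactly this kind of attack.

```python
from itertools import cycle

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy repeating-key XOR cipher; the same function encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

secret_key = b"K3y!"

# The adversary knows one pair (p_1, c_1), e.g. a message with a fixed header.
p1 = b"FROM: HEADQUARTERS"
c1 = xor_cipher(secret_key, p1)

# A new ciphertext c_2 whose plaintext the adversary wants to learn.
c2 = xor_cipher(secret_key, b"retreat at midnight")

# Attack: p_1 XOR c_1 reveals the keystream. The key length (4 bytes) is
# assumed to be known or guessed by the adversary.
recovered_key = bytes(p ^ c for p, c in zip(p1, c1))[:4]
recovered_p2 = xor_cipher(recovered_key, c2)
```

A single known pair is enough to recover the key here, and therefore every other message encrypted under it.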

1.3.2 Hard problems and NP-completeness 

Not all computationally difficult mathematical problems can yield one-way trapdoor functions. In order
for a family of problems to be used in this way, there must be a way to express the problems
as a trapdoor function, and we should be able to sample efficiently from the set of problems. Here
are some examples of hard problems:

Discrete Logarithms The discrete logarithm can be defined for arbitrary finite cyclic groups. A cyclic
group of order $n$ is isomorphic to $\mathbb{Z}/n\mathbb{Z}$, so exponents can be taken modulo $n$. Let $g$ be a
generator of such a group, for instance a subgroup of order $n$ of the multiplicative group modulo a
prime $p$. For any integer $x$, computing $y = g^x \bmod p$ requires $O(\log n)$
multiplications. Conversely, given $y$, computing $x$ such that $y = g^x \bmod p$ is called the
discrete logarithm problem (DLP). If $n$ can be factored into a product of smaller
integers, then the problem can be reduced to several smaller problems using the Chinese remainder
theorem. Therefore, it is usually required that $n$ has at least one large prime factor. In this
case, the problem is believed to be difficult, and it is the basis for many cryptosystems, such as
the Diffie-Hellman key agreement, ElGamal encryption, the ElGamal signature scheme and the Digital
Signature Algorithm. The most effective known algorithm is the number field sieve, which solves DLP in time

$$L_n(1/3, c + o(1)),$$

where $n$ here denotes the modulus (for integer factorization, the number to be factored), $c$ is a
constant, and $L_n(s, c) = \exp\left(c (\log n)^s (\log\log n)^{1-s}\right)$ [85].







RSA problem The RSA problem is the basis of the security of the RSA encryption and signature schemes.
Let $N = pq$ be a product of two large primes, and let $e, d$ be two integers such that $ed \equiv 1 \bmod (p-1)(q-1)$.
In the RSA setting, $N$ and $e$ are public, but not $p$, $q$ or $d$. For any integer $m$, anyone
can compute $C = m^e \bmod N$ in $O(\log e)$ multiplications of integers of bit size $O(\log N)$. But given a
random integer $C$, it is computationally difficult to find $m$ satisfying this relation, unless we
have the secret key $d$, in which case we can compute $m$ with the equation $m = C^d \bmod N$
in $O(\log d)$ multiplications of integers of bit size $O(\log N)$. Solving this problem is conjectured to
be as hard as factoring $N$.
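
The asymmetry between the easy and the hard direction of both problems can be sketched on toy parameters. The code below is only an illustration: the primes are tiny, the function names are our own, and the discrete logarithm is recovered with the generic baby-step giant-step method (about $\sqrt{n}$ steps), which is swapped in for simplicity and is not the number field sieve mentioned above.

```python
import math

def square_and_multiply(g: int, x: int, p: int) -> int:
    """Compute g^x mod p with O(log x) modular multiplications."""
    result, base = 1, g % p
    while x > 0:
        if x & 1:
            result = result * base % p
        base = base * base % p
        x >>= 1
    return result

def baby_step_giant_step(g: int, y: int, p: int, order: int):
    """Find x in [0, order) with g^x = y mod p in about sqrt(order) steps."""
    m = math.isqrt(order) + 1
    baby = {}
    value = 1
    for j in range(m):                 # baby steps: store g^j
        baby.setdefault(value, j)
        value = value * g % p
    factor = pow(g, -m, p)             # g^(-m) mod p (Python 3.8+)
    gamma = y % p
    for i in range(m):                 # giant steps: y * g^(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None

# Discrete logarithm: easy direction, then the (here still feasible) inverse.
p, g, x = 1019, 2, 777                 # toy prime; 2 generates the group mod 1019
y = square_and_multiply(g, x, p)
x_found = baby_step_giant_step(g, y, p, p - 1)

# Textbook RSA with toy primes.
rsa_p, rsa_q = 61, 53
N = rsa_p * rsa_q                      # 3233
e = 17
d = pow(e, -1, (rsa_p - 1) * (rsa_q - 1))   # 2753, the secret key
m = 65
C = square_and_multiply(m, e, N)       # public operation: 2790
m_recovered = square_and_multiply(C, d, N)  # needs the secret key d
```

With parameters of cryptographic size, the forward computations remain cheap, whereas the square-root cost of baby-step giant-step, in both time and memory, becomes prohibitive.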

Interestingly, although these two problems are considered to be computationally difficult, and no
efficient algorithm has been found to solve them in polynomial time, they are not known to be
NP-complete either. We do not know their exact complexity class. They belong to the NP complexity
class, because we can verify any potential solution in polynomial time, but there is no evidence that a
polynomial-time algorithm for finding their solutions exists.

If a large-scale quantum computer can be built, the DLP and the RSA problem can be solved much faster.
In fact, with Shor's algorithm [90], these two problems can be solved in polynomial time on a
sufficiently large quantum computer. That is to say, for schemes whose security relies on the DLP or the
RSA problem, an attacker with access to a large enough quantum computer will easily break the
cryptosystem. Therefore, the approach of the quantum computing era calls for reliable alternative
cryptographic systems which remain secure against attackers with quantum computers.


1.4 Lattice-based cryptography 

Lattice algorithms have been used in cryptology since the invention of the LLL reduction algorithm
in 1982. Their first applications were in cryptanalysis, to address problems such as finding small roots of
polynomials with Coppersmith's method and solving subset-sum problems. Later, in 1996, NTRU, the first
cryptosystem based on lattice problems, was proposed [47]. This system is very efficient and has been
incorporated into an IEEE standard. Meanwhile, new schemes are emerging. The technical
introduction to lattices is given in Chapter 2; here is a brief introduction to some hard lattice
problems:

Shortest Vector Problem (SVP) Given a lattice, find its shortest non-zero vector. In general, if we
are given a bad basis, the best known algorithms for solving this problem are exponential in the
lattice dimension. There are approximation versions and variants of this problem, such as
the GapSVP and Approx-SVP problems. Cryptosystems whose security relies on the hardness
of SVP include the GGH cryptosystem, the NTRU encryption scheme and so on.

Closest Vector Problem (CVP) Given a lattice and a point in the Euclidean space, find a vector
of the lattice which is closest to this point. This problem is also difficult, unless we have a good
basis. Cryptosystems whose security relies on the hardness of CVP include those based on the learning
with errors problem.
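
The role of the basis can be illustrated in dimension 2 with Babai's rounding technique, a standard heuristic for CVP that is not discussed above: express the target in the given basis over the reals and round the coefficients to integers. The lattice, the two bases and the target below are made-up toy values; both bases generate the same lattice.

```python
from fractions import Fraction

def babai_rounding(b1, b2, target):
    """Round the real coordinates of target in basis (b1, b2) to integers."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    # Cramer's rule for x1*b1 + x2*b2 = target, in exact arithmetic.
    x1 = Fraction(target[0] * b2[1] - target[1] * b2[0], det)
    x2 = Fraction(b1[0] * target[1] - b1[1] * target[0], det)
    k1, k2 = round(x1), round(x2)
    return (k1 * b1[0] + k2 * b2[0], k1 * b1[1] + k2 * b2[1])

def squared_distance(u, v):
    return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2

good_basis = ((3, 1), (-1, 4))        # short, nearly orthogonal vectors
bad_basis = ((12, 17), (17, 23))      # same lattice, long skewed vectors
target = (5, 3)

close_good = babai_rounding(*good_basis, target)
close_bad = babai_rounding(*bad_basis, target)
dist_good = squared_distance(close_good, target)
dist_bad = squared_distance(close_bad, target)
```

With the good basis, rounding returns the lattice point (6, 2), at squared distance 2 from the target, which is the closest lattice point in this example; with the bad basis the same procedure returns (8, 7), at squared distance 25.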

1.4.1 Security analysis 

The most notable classical hard problems are the discrete logarithm problem and the integer factorization
problem. Recently, many new cryptosystems have based their security on lattice-related hard problems, for
example the subset sum problem and the closest vector problem. Despite their larger key sizes and less
efficient computations compared to systems based on the discrete logarithm or integer factorization problems,
these cryptosystems have a number of advantages:

Post-quantum cryptography Unlike for the integer factorization and discrete logarithm problems,
up to now no quantum algorithm is known that solves hard lattice problems significantly more efficiently
than classical algorithms. Therefore there is hope that lattice-based cryptography
will survive in the post-quantum era.

Worst-case to average-case reduction When we talk about the complexity classes of hard mathematical
problems, we are usually referring to the hardest instances of those problems. It might happen
that these hardest instances indeed require large computations to solve, while some other instances
are easy. This raises security concerns: we are unable to guarantee the security of a
concrete instance, as it might fall into the category of easy instances. In practice, most instances of integer
factoring and discrete logarithm are difficult, but there is no theoretical proof ruling out the
possibility of an algorithm solving random instances with considerable probability. Some lattice-based
cryptography does not suffer from this drawback: following Ajtai's work [4], there is a worst-case
to average-case reduction, which means that if any algorithm can solve random instances with
non-negligible probability, then it can also be used to solve the hardest instances of another
problem. Later, this reduction was extended to some extent to ideal lattice problems [50].

1.4.2 Fully homomorphic encryption 

An attractive functionality provided by lattice-based cryptography is the possibility to construct fully 
homomorphic encryption schemes. This is an important advantage of lattice based cryptography, as we 
do not know how to build them from discrete logarithm or integer factoring problems. 

Homomorphic encryption allows specific operations to be performed on ciphertexts without knowing the
plaintexts. Upon decryption, the plaintext of the manipulated ciphertext corresponds exactly to the result of
applying the same operation to the original plaintexts. If the scheme allows the evaluation of arbitrary circuits,
it is called fully homomorphic encryption. The first fully homomorphic encryption system was
proposed by Gentry [30]. Since then, many improvements have been proposed, and an open source
implementation is now available (HElib [43]).
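
The idea of operating on ciphertexts can be illustrated with textbook RSA, which is homomorphic for multiplication only. This is a deliberately simple stand-in, under toy parameters of our own choosing: it is neither fully homomorphic nor lattice-based, and it is not Gentry's scheme.

```python
# Textbook RSA with toy primes (assumed values, insecure by design).
p, q = 61, 53
N = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # secret exponent (Python 3.8+)

def encrypt(m: int) -> int:
    return pow(m, e, N)

def decrypt(c: int) -> int:
    return pow(c, d, N)

m1, m2 = 7, 12
c1, c2 = encrypt(m1), encrypt(m2)

# Multiply the ciphertexts without ever decrypting them ...
c_product = c1 * c2 % N

# ... and the result decrypts to the product of the plaintexts (mod N).
result = decrypt(c_product)
```

A fully homomorphic scheme must support both addition and multiplication on ciphertexts, so that arbitrary circuits can be evaluated; textbook RSA offers no way to add encrypted values.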

1.4.3 Lattice reduction 

Lattices are usually represented by a basis. Every non-trivial lattice of dimension > 1 has infinitely many
bases, and some bases are better than others in that they better reveal essential properties of the
lattice. With a good basis, we can compute solutions of SVP and CVP with less effort.

Lattice reduction is the process of transforming any basis of a lattice into a better one. Generally this is
done in a progressive manner: we first perform weak reductions before applying stronger ones.
The significance of reduction is twofold. First, it yields better bases, on which we can run
algorithms such as enumeration to compute solutions to problems such as SVP and CVP. Second, in some cases a
strongly reduced basis already provides an approximate solution.
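
In dimension 2, reduction can be carried out exactly with the classical Lagrange-Gauss algorithm, which can be seen as the two-dimensional ancestor of the stronger reductions discussed in this thesis. The sketch below is an illustration on a made-up integer basis; it uses integer arithmetic only and returns a basis whose first vector is a shortest non-zero vector of the lattice.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def lagrange_gauss(b1, b2):
    """Reduce a basis of a 2-dimensional integer lattice."""
    if dot(b1, b1) > dot(b2, b2):
        b1, b2 = b2, b1
    while True:
        # Nearest integer to <b1, b2> / <b1, b1>, computed without floats.
        a, b = dot(b1, b2), dot(b1, b1)
        mu = (2 * a + b) // (2 * b)
        # Size-reduce b2 against b1.
        b2 = (b2[0] - mu * b1[0], b2[1] - mu * b1[1])
        if dot(b2, b2) >= dot(b1, b1):
            return b1, b2
        b1, b2 = b2, b1

# A "bad" basis: long, nearly parallel vectors (toy example).
bad_b1, bad_b2 = (12, 17), (17, 23)
short, second = lagrange_gauss(bad_b1, bad_b2)

def abs_det(u, v):
    return abs(u[0] * v[1] - u[1] * v[0])
```

Starting from vectors of squared norms 433 and 818, the algorithm returns vectors of squared norms 10 and 17. The absolute determinant stays equal to 13, since both bases generate the same lattice.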

Therefore, it is of great importance to study the performance of reduction algorithms, and to
analyze how to optimally combine reduction and enumeration to solve SVP and CVP. This study is
significant both in theory and in practice for the security estimates of the majority of lattice-based
cryptographic primitives. It can be compared to the significance of analyzing integer factoring for RSA
cryptography.
Introduced in 1982, LLL reduction was the first polynomial-time algorithm that reduces a basis and gives
a worst-case bound on the quality of the output basis. Many applications need stronger
reductions, which led to the idea of block-wise reduction, introduced by Schnorr in 1987 [86]. The most popular
block-wise reduction in practice is BKZ reduction, which is implemented as open source code in the
NTL library [91]. However, in spite of the widespread use of this algorithm as a
benchmark for security estimates of cryptosystems, its practical performance is not well studied. The
improvement of the BKZ algorithm and its analysis form an important part of this thesis.

1.5 Structure of this thesis 

The main focus of this thesis is to provide security estimates for lattice-based cryptography based on
the analysis of reduction algorithms. The major part of the work is devoted to the analysis and improvement of
the BKZ lattice reduction algorithm, which spans chapter 2 to chapter 6. Chapter 7 is an independent
work on a successful attack on an FHE scheme using a classical time-space trade-off.

1.5.1 New security estimates based on improved BKZ reduction 

BKZ reduction is the most practical lattice reduction algorithm up to now. It is frequently used in
security estimates for various cryptosystems. In spite of its importance and wide popularity, little
analysis has been given of either its practical running time or its output quality. Our goal is to understand the behavior
of BKZ reduction in practice, and on that basis to improve the algorithm. Consequently, the
security estimates are renewed for many schemes.

The mathematical basics of lattices are presented in chapter 2, where we also go over
the hard problems and the existing algorithms. The most essential ingredient of BKZ reduction is the
enumeration procedure. BKZ reduction makes a polynomial number of calls to the enumeration
procedure, and the time cost of each enumeration is exponential in the dimension of the enumeration. However,
since the invention of extreme pruning [29], it has been possible to speed up the procedure tremendously by
making it probabilistic. We develop the analysis of pruning, and in particular we propose an effective way
to quickly generate pruning strategies for different parameters. The in-depth analysis of the enumeration
algorithm is given in chapter 3. In chapter 4, we propose a new BKZ algorithm which incorporates several
improvements, including those to enumeration discussed previously. We also provide a way to simulate the
average behavior of the BKZ algorithm. With this simulation tool, we further study the BKZ algorithm
and propose a new BKZ procedure with recursion, which is presented in Chapter 5. The new BKZ
algorithm aims to renew many security estimates, including those of the famous NTRU and some FHE
schemes.

1.5.2 Cryptanalysis of a fully homomorphic encryption scheme 

Chapter 6 describes an attack on an FHE scheme proposed in [22]. The first fully homomorphic
encryption (FHE) scheme, proposed by Gentry in 2009 [30], was followed by numerous other proposals, some
of them using different underlying hard problems. One FHE scheme makes use of the approximate
common divisor (ACD) problem [97]. In particular, [22] proposed a much more efficient variant of the ACD
problem in which the noise is chosen to be relatively small. Unfortunately, this choice overlooked an
attack using a classical algorithm which finds the secret key in the square root of the time required by
brute-force enumeration. This algorithm uses a time-space trade-off trick and performs several evaluations
at the same time. This method is then generalized to various applications in which the enumeration has a
super-cubic structure.
2. Introduction to Lattices


2.1 Lattices and bases 

There are several ways to define a lattice. Here is the main definition. 

Definition 2.1.1 (Lattice). A lattice $\mathcal{L}$ of $\mathbb{R}^m$ is a discrete additive subgroup of $(\mathbb{R}^m, +)$.

A lattice is discrete, which means that each point of the lattice has a neighborhood which
contains no other lattice point. Because a lattice is also a group, this implies that there exists a constant
$\epsilon > 0$ such that any two lattice points $v_1 \neq v_2$ satisfy $\|v_1 - v_2\| > \epsilon$. Intuitively, a lattice can be
viewed as a set of points regularly distributed in the space $\mathbb{R}^m$. Below is a representation of a lattice in
$\mathbb{R}^2$ (fig. 2.1).


Figure 2.1: A lattice in $\mathbb{R}^2$



As a group, a lattice always includes the zero element $0$. A lattice is trivial if it does not contain
any other element. Otherwise, it contains infinitely many points. For a set of $d$ vectors $v_1, \dots, v_d$, if there
exist $d$ real numbers $c_1, \dots, c_d$, not all zero, such that $c_1 v_1 + \cdots + c_d v_d = 0$, then we say that these vectors
are linearly dependent over $\mathbb{R}$. Otherwise they are linearly independent over $\mathbb{R}$.

2.1.1 Lattice basis 

Definition 2.1.2 (Rank). The maximum number of linearly independent vectors in a lattice $\mathcal{L}$ is the lattice
dimension or rank.

In $\mathbb{R}^m$, there are at most $m$ linearly independent vectors, therefore the rank of a lattice $\mathcal{L}$ in $\mathbb{R}^m$ is at
most $m$. When the rank of $\mathcal{L}$ is equal to $m$, $\mathcal{L}$ is called full-rank.

Definition 2.1.3 (Basis). Let $\mathcal{L}$ be a lattice in $\mathbb{R}^m$, and let $B = \{b_1, \dots, b_n\}$ be a set of linearly independent
vectors in $\mathcal{L}$. If for any vector $v$ in $\mathcal{L}$ there exist integers $x_1, \dots, x_n$ such that $v = x_1 b_1 + \cdots + x_n b_n$,
then $B$ is called a basis of the lattice $\mathcal{L}$.

We also define $\mathrm{span}(B) = \{c_1 b_1 + \cdots + c_n b_n \mid c_i \in \mathbb{R}\}$ to be the vector space spanned by $B$. When $B$
is a maximal set of linearly independent vectors in $\mathcal{L}$, all lattice points $v$ are included in $\mathrm{span}(B)$,
because otherwise $B$ could be augmented by $v$. Moreover, there exists a representation
$(x_1, \dots, x_n) \in \mathbb{R}^n$ such that $\sum_{i=1}^{n} x_i b_i = v$. This representation is unique, because $B$ is linearly
independent.

Theorem 2.1.4 (Existence of basis). A non-zero lattice $\mathcal{L}$ has at least one basis. The number of elements of a
basis equals the rank of the lattice.
Proof. We provide a proof by induction on the lattice rank $n$, which is a variation of the proof given
in [68]. For $n = 1$, a lattice of rank 1 has a basis $\{v\}$, where $v$ is a shortest non-zero vector. All
other vectors are necessarily integer multiples of $v$, because $\mathcal{L}$ does not contain a linearly independent set of
size $> 1$, and because $v$ has minimal norm among the non-zero vectors. Assume that the theorem holds for $n = k$, that
is, any lattice of rank $k$ has at least one basis, and all of its bases have size $k$. Let $n = k + 1$. In the
following we find a basis for an arbitrary lattice $\mathcal{L}_{k+1}$ of rank $k + 1$.

Let $C = \{c_1, \dots, c_k, c_{k+1}\}$ be a set of $k + 1$ linearly independent points in $\mathcal{L}_{k+1}$. According
to the induction hypothesis, there is a basis of $\mathcal{L}_k = \mathcal{L}_{k+1} \cap \mathrm{span}(\{c_1, \dots, c_k\})$,
which we call $B_k = \{b_1, \dots, b_k\}$. Then $\{b_1, \dots, b_k, c_{k+1}\}$ is also a linearly independent set. For any
vector $v \in \mathcal{L}_{k+1}$, there is a unique representation $(x_1, \dots, x_{k+1}) \in \mathbb{R}^{k+1}$ such that $\sum_{i=1}^{k} x_i b_i + x_{k+1} c_{k+1} = v$.
In particular, when $x_{k+1} = 0$ the point $\sum_{i=1}^{k} x_i b_i$ is in $\mathcal{L}_k$, and since $B_k$ is a basis of $\mathcal{L}_k$, the $x_i$, $1 \le i \le k$, are all
integers. Consider the set of all possible values of $x_{k+1}$:

$$S = \left\{ x \;:\; \exists\, x_1, \dots, x_k \text{ s.t. } \sum_{i=1}^{k} x_i b_i + x\, c_{k+1} \in \mathcal{L}_{k+1} \right\}.$$

Clearly $S$ is an additive group which contains at least 0 and 1. For every $x \in S$, the point
$\sum_{i=1}^{k} (x_i - \lfloor x_i \rfloor) b_i + x\, c_{k+1}$ is also in $\mathcal{L}_{k+1}$. Since $\mathcal{L}_{k+1}$ is a lattice, it has finitely many elements in any compact set. That is to say,
there are finitely many points in $\mathcal{L}_{k+1}$ whose norms are bounded by $\|c_{k+1}\| + \sum_{i=1}^{k} \|b_i\|$. Therefore
only finitely many elements $0 \le x \le 1$ belong to $S$, and $S$ must contain a minimum positive element
$s > 0$, which we suppose is achieved by a lattice point $v_0 \in \mathcal{L}_{k+1}$. All other values $t \in S$ are multiples of $s$,
otherwise $t' = t - \lfloor t/s \rfloor \cdot s$ would be a smaller positive element than $s$. In the following we prove that
$B_k \cup \{v_0\}$ is a basis for $\mathcal{L}_{k+1}$. In fact, any vector $v \in \mathcal{L}_{k+1}$ has a representation
$(x_1, \dots, x_k, x_{k+1})$ over $b_1, \dots, b_k, c_{k+1}$ in which $x_{k+1}$ is some multiple of $s$; we write $x_{k+1} = q \cdot s$, $q \in \mathbb{Z}$. Then $v - q \cdot v_0$
is in $\mathcal{L}_k$, and can be expressed as an integer combination of $B_k$. Therefore $v$ can be expressed as an integer
combination of the vectors in $B_k \cup \{v_0\}$, which proves that any lattice of rank $k + 1$ has a basis of size $k + 1$.
By induction this holds for arbitrary $n$, which completes the proof. □

Remark. A lattice is not a vector space, although they have many points in common. An integer combination of
lattice points is still a lattice point. But since the integers do not form a field, we cannot normalize a vector in a lattice
as in a vector space.

All lattices have bases and, conversely, any set of linearly independent vectors generates
a lattice. This follows from the following theorem.

Theorem 2.1.5. Given any linearly independent vectors $b_1, \dots, b_n \in \mathbb{R}^m$, the set

$$\mathcal{L}(\{b_1, \dots, b_n\}) = \left\{ \sum_{i=1}^{n} x_i b_i \;:\; x_i \in \mathbb{Z} \right\}$$

is a lattice, called the lattice generated by $B = (b_1, \dots, b_n)$.

Proof. We reproduce the proof given in [68]. It is clear that $\mathcal{L}(B)$ is a group; it remains to prove its
discreteness. Consider the parallelepiped

$$P = \left\{ \sum_{i=1}^{n} x_i b_i \;:\; |x_i| < 1 \right\}.$$

Since the $b_i$'s are linearly independent, $\mathcal{L}(B) \cap P = \{0\}$. Since $P$ is convex and non-degenerate in
$\mathrm{span}(B)$, there exists $\epsilon > 0$ such that the ball of radius $\epsilon$ centered at 0 in $\mathrm{span}(B)$ is contained in $P$, and therefore intersects $\mathcal{L}(B)$
only at the point 0. By the group property, the same holds around every point of the set, which proves that this is a
lattice. □


The theorem shows that any set of linearly independent vectors generates a lattice. This gives an
alternative definition of lattices which is equivalent to the first one. In cryptology, the most common
way to represent a lattice is by a basis.

A basis $B = \{b_1, \dots, b_n\}$ can also be written in matrix form, by placing $b_i$ in its $i$-th row:

$$B = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}.$$

Here we assume the $b_i$ to be row vectors.

Definition 2.1.6 (Unimodular matrix). A unimodular matrix $U$ is a square integer matrix having determinant
1 or -1.


Theorem 2.1.7. The set of all $n$-by-$n$ unimodular matrices is $GL_n(\mathbb{Z})$, the set of all $n$-by-$n$ integer matrices
which are invertible over the integers.

Proof. Let $M$ be an arbitrary unimodular matrix. The entry in the $j$-th row and $i$-th column of $M^{-1}$ is given
by $(-1)^{i+j} \det(M_{i,j}) / \det(M)$, where $M_{i,j}$ is the submatrix of $M$ obtained by deleting its $i$-th row and $j$-th column.
Since $\det(M) = \pm 1$ and each $\det(M_{i,j})$ is an integer, $M^{-1}$ is an integer matrix. Therefore $M$ is invertible
over the integers. Conversely, for any matrix $M$ that is invertible over the integers, $\det(M)$
and $\det(M^{-1})$ are both integers. Since $\det(M) \cdot \det(M^{-1}) = 1$, the only possible values for $\det(M)$
are 1 and -1. □


For non-trivial lattices of rank strictly larger than 1, bases are not unique. In fact, if $B \in \mathbb{R}^{n \times m}$ is a basis
of an $n$-dimensional lattice $\mathcal{L} \subset \mathbb{R}^m$, and $U \in \mathbb{Z}^{n \times n}$ is a unimodular matrix, then $B' = UB$ is also a basis of the
same lattice.

Lemma 2.1.8. Let $B_0$ be a basis of a lattice $\mathcal{L}$ of rank $n$. Then the set of all bases of this lattice is $\{U B_0 : U \in GL_n(\mathbb{Z})\}$.

Proof. First of all, if $B$ is also a basis of $\mathcal{L}$, then there is a representation with integer coefficients of $B$
over $B_0$, and vice versa. This means that there is an integer matrix $U$ which satisfies $B = U B_0$ and is
invertible over the integers. Therefore $U \in GL_n(\mathbb{Z})$. On the other hand, for any $U \in GL_n(\mathbb{Z})$, define
$B = U B_0$. Then $B_0 = U^{-1} B$, thus every point in $\mathcal{L}$ is an integer combination of $B$, and $B \subset \mathcal{L}$. We
conclude that $B$ is a basis of $\mathcal{L}$. □

Given a basis $B$ of a lattice $\mathcal{L}$, multiplying it by a unimodular matrix yields a basis $B' = UB$ of the
same lattice; this is called a basis transformation. Among all the bases of a lattice, the value $|\det(BB^T)|$
is constant, because unimodular matrices have determinant 1 or -1.
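The invariance of $\det(BB^T)$ under a basis transformation can be checked numerically. The following is a minimal sketch using NumPy; the basis `B`, the unimodular matrix `U` and the helper name `gram_det` are illustrative assumptions, not objects taken from the text.

```python
import numpy as np

def gram_det(B):
    """Return det(B B^T) for a basis whose vectors are the rows of B."""
    B = np.asarray(B, dtype=float)
    return float(np.linalg.det(B @ B.T))

# Hypothetical example data: a rank-2 basis in R^3 (row vectors).
B = np.array([[1, 2, 0],
              [0, 1, 3]])

# A 2x2 integer matrix with determinant 1, hence unimodular.
U = np.array([[2, 1],
              [1, 1]])

# Basis transformation: B2 generates the same lattice as B.
B2 = U @ B

print(gram_det(B), gram_det(B2))
```

Both printed values agree up to floating-point error, as expected from $\det(UBB^TU^T) = \det(U)^2 \det(BB^T)$.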


2.1.2 Lattice volume and Gaussian heuristic 
Definition 2.1.9 (Volume). We define the volume of a lattice $\mathcal{L}$ as

$$\mathrm{vol}(\mathcal{L}) = \sqrt{\det(BB^T)},$$

where $B$ is an arbitrary basis of the lattice $\mathcal{L}$.

Geometrically, this is the volume of the parallelepiped spanned by the basis $B$. The volume of a
lattice measures the average density of the lattice, in the following sense. Define the
parallelepiped $P(B) = \{\sum_{i=1}^{n} x_i b_i : -1/2 < x_i \le 1/2\}$; the volume of this parallelepiped equals
$\mathrm{vol}(\mathcal{L})$. Meanwhile, we have

$$\bigcup_{v \in \mathcal{L}} (P(B) + v) = \mathrm{span}(B),$$

and

$$\forall\, v_1 \neq v_2 \in \mathcal{L}, \quad (P(B) + v_1) \cap (P(B) + v_2) = \emptyset,$$

which shows that the translates of $P(B)$ form a tiling of the space $\mathrm{span}(B)$. This means that in $\mathrm{span}(B)$, the space
occupied by a lattice point is $\mathrm{vol}(\mathcal{L})$ on average. From the point of view of group theory, this
parallelepiped is simply a fundamental domain of the quotient group $\mathbb{R}^n / \mathcal{L}$. Furthermore, any measurable set $S \subseteq \mathrm{span}(B)$ with
volume $\mathrm{vol}(S)$ is intuitively expected to contain approximately $\mathrm{vol}(S) / \mathrm{vol}(\mathcal{L})$ such parallelepipeds,
therefore heuristically this should also be the number of points in $S \cap \mathcal{L}$. This estimate of the number of
lattice points is called the Gaussian Heuristic.

Heuristic 2.1.10 (Gaussian heuristic 1). Let $\mathcal{L}$ be an $n$-dimensional lattice in $\mathbb{R}^n$, and let $S$ be a measurable subset of
$\mathbb{R}^n$ with finite volume. The Gaussian Heuristic predicts that the number of lattice points in $S$ is

$$\#\{S \cap \mathcal{L}\} \approx \mathrm{vol}(S) / \mathrm{vol}(\mathcal{L}).$$


Being a heuristic estimate, it can be very close to the true answer, but we can also deliberately construct
sets that make the estimate arbitrarily far from the truth. For example, the lattice $\mathcal{L} = \mathbb{Z}^n$ has volume
1, and the set

$$S = \{ x \in \mathbb{R}^n : 1/4 < x_1 < 3/4, \text{ each of } x_2, \dots, x_n \text{ in a fixed interval of length } a \}$$

has volume $a^{n-1}/2$, but it does not contain any lattice point. If we restrict $S$ to have a certain shape, however,
we can be more confident about this estimate. Intuitively, when $S$ is convex and big enough,
the estimate should be reasonable. In particular, when $S$ is a ball, we have the following lemma.


Lemma 2.1.11. Let $\mathcal{L}$ be an $n$-dimensional lattice, and let $v_n$ be the volume of the closed unit ball $S$ of
dimension $n$ in $\mathrm{span}(\mathcal{L})$. For any $r > 0$, denote by $s_{\mathcal{L}}(r)$ the number of lattice points $v \in \mathcal{L}$ such that $v / r \in S$.
Then,

$$\lim_{r \to \infty} \frac{r^n v_n}{s_{\mathcal{L}}(r)} = \mathrm{vol}(\mathcal{L}).$$
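The limit in Lemma 2.1.11 can be observed on the simplest example. The sketch below assumes the lattice $\mathbb{Z}^2$ (volume 1, $v_2 = \pi$) and integer radii; the function name `count_points_in_disk` is an illustrative choice.

```python
import math

def count_points_in_disk(r: int) -> int:
    """Count the points of Z^2 with Euclidean norm at most r (r a non-negative integer)."""
    count = 0
    for x in range(-r, r + 1):
        # Largest |y| such that x^2 + y^2 <= r^2.
        ymax = math.isqrt(r * r - x * x)
        count += 2 * ymax + 1
    return count

# The ratio r^2 * v_2 / s(r) should approach vol(Z^2) = 1 as r grows.
for r in (1, 10, 100, 1000):
    s = count_points_in_disk(r)
    print(r, s, math.pi * r * r / s)
```

For small radii the ratio is visibly off, which matches the remark that the heuristic is only reliable when the set is big enough.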

Following the Gaussian heuristic, one can obtain the following intuition about $\lambda_1(\mathcal{L})$, the norm of a
shortest non-zero vector, where $\mathcal{L}$ is a random lattice. The formal definition of random lattices will be presented
later in section 2.3. In a random lattice $\mathcal{L}$ of large dimension, the ball of radius $(\mathrm{vol}(\mathcal{L}) / v_n)^{1/n}$ is
expected to contain a single point. Therefore $\lambda_1(\mathcal{L})$ is expected to be around this value. In fact, in chapter 3,
we will arrive at a more precise evaluation of $E(\lambda_1)$, which is given by the following theorem.


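Theorem 2.1.12. For random lattices $\mathcal{L}$ of dimension $n$, when $n$ tends to infinity, we have

$$E(\lambda_1(\mathcal{L})) = (1 - \gamma / n) \left( \frac{2\, \mathrm{vol}(\mathcal{L})}{v_n} \right)^{1/n}, \qquad (2.1)$$

where $\gamma \approx 0.577$ is the Euler-Mascheroni constant.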


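Remark. A common approximation to $E(\lambda_1(\mathcal{L}))$ for random lattices is $(\mathrm{vol}(\mathcal{L}) / v_n)^{1/n}$, which we denote by GH.
According to this theorem, the difference between GH and $E(\lambda_1(\mathcal{L}))$ is expressed by the following formula:

$$E(\lambda_1(\mathcal{L})) = (1 - \gamma / n) \left( \frac{2\, \mathrm{vol}(\mathcal{L})}{v_n} \right)^{1/n} \approx \left( 1 + \frac{\ln 2 - \gamma}{n} \right) \cdot GH.$$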
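To get a feeling for the size of the correction, the two estimates can be evaluated numerically. This is a minimal sketch which assumes that formula (2.1) reads $E(\lambda_1) = (1 - \gamma/n)(2\,\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$ and that GH denotes $(\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$; the function names are illustrative.

```python
import math

EULER_MASCHERONI = 0.5772156649015329

def unit_ball_volume(n: int) -> float:
    """Volume v_n of the unit ball in dimension n: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def gaussian_heuristic(n: int, vol: float = 1.0) -> float:
    """GH estimate of lambda_1 for a lattice of dimension n and volume vol."""
    return (vol / unit_ball_volume(n)) ** (1.0 / n)

def expected_lambda1(n: int, vol: float = 1.0) -> float:
    """Estimate of E(lambda_1) under the assumed reading of formula (2.1)."""
    return (1 - EULER_MASCHERONI / n) * (2 * vol / unit_ball_volume(n)) ** (1.0 / n)

for n in (50, 100, 200):
    gh = gaussian_heuristic(n)
    print(n, gh, expected_lambda1(n) / gh)
```

The printed ratio stays close to 1 and approaches it as $n$ grows, which is why GH is commonly used as a stand-in for $E(\lambda_1)$.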
2.1.3 Sublattices and projected lattices

The group property and the geometric structure of lattices allow us to decompose lattices in different 
ways. When the dimension of a lattice is big, it is interesting to look at its sublattices and projected 
lattices to explore its hidden structures. 

Definition 2.1.13 (Sublattice). Let $\mathcal{L}$ be a lattice in $\mathbb{R}^m$. A sublattice of $\mathcal{L}$ is a lattice $\mathcal{L}'$ included in $\mathcal{L}$.

When $\mathcal{L}'$ is a sublattice of $\mathcal{L}$, all the vectors of a basis $B'$ of $\mathcal{L}'$ can be expressed as integer
combinations of the basis vectors of $\mathcal{L}$. This means that there exists an integer matrix $T$ such that $B' = TB$. On the
other hand, for any $d$-by-$n$ integer matrix $T$ of rank $d$, $TB$ is a set of $d$ linearly independent
vectors, thus $\mathcal{L}(TB)$ is clearly a sublattice of $\mathcal{L}(B)$. In particular, the rank of $\mathcal{L}(TB)$ is $d$, and when $d = n$ we have
$\mathrm{vol}(\mathcal{L}(TB)) = \mathrm{vol}(\mathcal{L}(B)) \cdot \sqrt{\det(TT^T)}$.

When $d = n$, the sublattice is full-rank. In particular, if $T$ is a unimodular matrix, then it
degenerates into the trivial case $\mathcal{L}' = \mathcal{L}$. When $d < n$, one particularly interesting case is the primitive
sublattice.


Definition 2.1.14 (Primitive sublattice). A sublattice $\mathcal{L}'$ of $\mathcal{L}$ is called primitive if and only if for any basis
$(b_1, \dots, b_r)$ of $\mathcal{L}'$, there exist $(b_{r+1}, \dots, b_n) \in \mathcal{L}$ such that $(b_1, \dots, b_n)$ is a basis of $\mathcal{L}$.


For a lattice $\mathcal{L}(B)$ defined by the basis $B = (b_1, \dots, b_n)$, we can naturally take the subset of its first $r$
basis vectors $B_r = (b_1, \dots, b_r)$ with $r < n$, and $\mathcal{L}(B_r)$ is a primitive sublattice of $\mathcal{L}(B)$.
In general, for any primitive sublattice $\mathcal{L}'$ of $\mathcal{L}$, we have $\mathcal{L}' = \mathcal{L} \cap \mathrm{span}(\mathcal{L}')$. Conversely, for any
subspace $E \subset \mathbb{R}^m$ and $\mathcal{L}$ generated by the basis $B$, if the set $B' = \{b_i \in B : b_i \in E\}$ is linearly independent,
then $\mathcal{L}' = \mathcal{L}(B) \cap E$ is a primitive sublattice generated by the basis $B'$, which can be augmented to a
basis of $\mathcal{L}$ by $\{b_i \in B : b_i \notin E\}$.

Apart from sublattices, another way to lower the dimension of a lattice is by projection. The group
property is always preserved by projection of a lattice, but there is no guarantee for the discreteness of
the projected lattice. In fact, let

$$B = \begin{pmatrix} 1 & 0 \\ \sqrt{2} & 1 \end{pmatrix}$$

be the basis of a lattice $\mathcal{L}$; then its projection onto the first
coordinate is not discrete, because the set $\mathbb{Z} + \sqrt{2}\,\mathbb{Z}$ is dense in $\mathbb{R}$.

In order to ensure discreteness, we have to restrict the subspace corresponding to the projection.
In the previous example, the projection of the lattice is not discrete because both its basis vectors are
projected into the subspace, and it falls into the unfortunate case where no non-trivial integer combination of these
two projected vectors can be zero. Intuitively, when we project a lattice $\mathcal{L}$ into a subspace of $\mathrm{span}(\mathcal{L})$ of
smaller dimension, its rank should decrease accordingly, and we hope that some of its basis
vectors are projected to zero. For example, this is the case if we project a lattice orthogonally
onto the orthogonal complement of a primitive sublattice.


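Lemma 2.1.15 (Projected lattice). Let $\mathcal{L}$ be a lattice of rank $n$ in $\mathbb{R}^m$, and let $\mathcal{L}'$ be a primitive sublattice of $\mathcal{L}$ of rank $r$,
$1 \le r \le n$. Let $\pi_{\mathcal{L}'}$ denote the orthogonal projection onto $\mathrm{span}(\mathcal{L}')^{\perp}$. Then $\pi_{\mathcal{L}'}(\mathcal{L})$ is a lattice of rank $n - r$.

Proof. Suppose $B_r = \{b_1, \dots, b_r\}$ is a basis of $\mathcal{L}'$ which can be completed into a basis of $\mathcal{L}$ of the form
$B_n = B_r \cup \{b_{r+1}, \dots, b_n\}$. Then $\pi_{\mathcal{L}'}(\mathcal{L})$ is the lattice generated by the basis $\{\pi_{\mathcal{L}'}(b_{r+1}), \dots, \pi_{\mathcal{L}'}(b_n)\}$,
where the $\pi_{\mathcal{L}'}(b_i)$, $r < i \le n$, are linearly independent vectors. □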

Very often, when we define a lattice by a given basis $B$, the order of the basis vectors $\{b_1, \dots, b_n\}$
is also fixed, and this order is very important. The primitive sublattice $\mathcal{L}(B_r)$, $r < n$, and the
projected lattice $\pi_{\mathcal{L}(B_r)}(\mathcal{L}(B))$ then decompose the lattice into two parts, in the orthogonal subspaces $\mathrm{span}(B_r)$ and
$\mathrm{span}(B_r)^{\perp}$ respectively. In particular, for integers $i, j$ such that $1 \le i \le j \le n$, we can define the projection
of the sublattice $\mathcal{L}(B_j)$ onto the orthogonal complement of $\mathrm{span}(B_{i-1})$. The resulting projected sublattice
$\pi_{\mathcal{L}(B_{i-1})}(\mathcal{L}(B_j))$ discloses the "local structure" of the lattice, much like slicing the lattice at indices $i$ and
$j$. This decomposition is very useful in many applications.

For convenience of notation, when the basis is clear from the context, we will use $\pi_i$ to mean the
projection $\pi_{\mathcal{L}(B_{i-1})}$ for $1 < i \le n$, and by convention we use $\pi_1$ to mean the identity. We will also use
$B_{[i,j]}$ to mean the basis $\{\pi_i(b_i), \dots, \pi_i(b_j)\}$; the previous projected sublattice $\pi_{\mathcal{L}(B_{i-1})}(\mathcal{L}(B_j))$,
$1 \le i \le j \le n$, can therefore be denoted by $\mathcal{L}(B_{[i,j]})$.


2.1.4 Gram-Schmidt orthogonalization 

Unlike in the vector space $\mathbb{R}^n$, where we can always find orthonormal bases, the typical
situation for lattices is that there is no orthogonal basis. But we still have a Gram-Schmidt orthogonalization
process. Given a set of linearly independent vectors $B = \{b_1, \dots, b_n\}$, it produces a set of vectors
$\{b_1^*, \dots, b_n^*\}$ which are pairwise orthogonal. With lattices, there are two main differences
compared to vector spaces:

1. Apart from $b_1^*$, the vectors $b_2^*, \dots, b_n^*$ usually do not belong to the lattice.

2. Their norms $\|b_1^*\|, \dots, \|b_n^*\|$ are not equal to 1, and are not equal to each other in general.

Definition 2.1.16 (Gram-Schmidt Orthogonalization). Let $\{b_1, \dots, b_n\}$ be linearly independent vectors in
$\mathbb{R}^m$. The Gram-Schmidt Orthogonalization (GSO) is the orthogonal family $(b_1^*, \dots, b_n^*)$ where $b_i^*$ is defined as
$\pi_i(b_i)$.

The vectors in the orthogonal family can be computed recursively as follows:

$$b_i^* = b_i - \sum_{j=1}^{i-1} \mu_{i,j}\, b_j^*, \quad \text{where } \mu_{i,j} = \frac{\langle b_i, b_j^* \rangle}{\|b_j^*\|^2} \text{ for all } 1 \le j < i \le n.$$

Each $b_i^*$ is obtained by removing from $b_i$ its component in $\mathrm{span}(b_1, \dots, b_{i-1})$. Actually, this
process can also be described as multiplication by an elementary matrix. Consequently, the orthogonalization
process can be described in terms of decomposing the basis $B$ into a product of two matrices:


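$$\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & \cdots & \cdots & 0 \\
\mu_{2,1} & 1 & \ddots & & \vdots \\
\mu_{3,1} & \mu_{3,2} & \ddots & \ddots & \vdots \\
\vdots & \vdots & \ddots & 1 & 0 \\
\mu_{n,1} & \mu_{n,2} & \cdots & \mu_{n,n-1} & 1
\end{pmatrix}
\begin{pmatrix} b_1^* \\ b_2^* \\ b_3^* \\ \vdots \\ b_n^* \end{pmatrix}
\qquad (2.2)$$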


Here, the matrix $M = (\mu_{i,j})_{1 \le i \le n,\, 1 \le j \le n}$ is lower-triangular with 1 on its diagonal, and the row vectors
$(b_1^*, \dots, b_n^*)$ are orthogonal to each other. In fact, we can further normalize the vectors $b_i^*$. This
operation can also be described as a product of two matrices. As a result, the basis is decomposed into
a product of three matrices, $B = M D O$: the matrix $M$, a diagonal matrix $D$ and an $n \times m$ matrix $O$.
The matrix $D$ is a diagonal matrix of rank $n$, so normally it suffices to store the vector $(\|b_1^*\|, \dots, \|b_n^*\|)$,
which saves space.

The rows of the matrix $O$ are orthonormal vectors, which corresponds to a rigid rotation of the canonical
basis of $\mathbb{R}^m$. When $n < m$, this means throwing away the last $m - n$ axes after the rigid rotation.
Therefore, up to a rotation, all lattices of rank $n$ are congruent to some full-rank lattice of rank $n$. For
example, $\mathcal{L}(B)$ is congruent to the lattice $\mathcal{L}(M \cdot D)$.
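The decomposition into $M$, $D$ and $O$ described above can be computed directly from the Gram-Schmidt recursion. The following is a minimal floating-point sketch using NumPy, assuming the basis vectors are the rows of `B`; the function name `gso` and the example basis are illustrative, and a real implementation would need to care about numerical precision.

```python
import numpy as np

def gso(B):
    """Return (M, D, O) with B = M @ D @ O for a basis given by the rows of B.

    M is lower-triangular with unit diagonal (the coefficients mu_{i,j}),
    D is diagonal with entries ||b_i^*||, and the rows of O are b_i^* / ||b_i^*||.
    """
    B = np.asarray(B, dtype=float)
    n = B.shape[0]
    Bstar = np.zeros_like(B)
    M = np.eye(n)
    for i in range(n):
        Bstar[i] = B[i]
        for j in range(i):
            # mu_{i,j} = <b_i, b_j^*> / ||b_j^*||^2
            M[i, j] = (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j])
            Bstar[i] -= M[i, j] * Bstar[j]
    norms = np.linalg.norm(Bstar, axis=1)
    D = np.diag(norms)
    O = Bstar / norms[:, None]
    return M, D, O

# Hypothetical example: a rank-2 basis in R^3.
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
M, D, O = gso(B)
print(M)
print(np.diag(D))
```

The product of the diagonal entries of `D` gives the volume of the lattice, and only that diagonal needs to be stored.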

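In fact, it is easy to see that $(MD) \cdot O$ is a QR-decomposition of $B$, with the triangular factor on the left
because the basis vectors are rows. Conversely, given such a decomposition $B = QR$, we can compute the GSO
decomposition by taking $O = R$, the diagonal of $D$ to be the same as the diagonal of $Q$, and each entry of $M$ to be

$$\mu_{i,j} = \begin{cases} 0 & j > i \\ 1 & j = i \\ Q_{i,j} / Q_{j,j} & j < i \end{cases}$$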

This triangular decomposition of $B$ makes it easier to analyze the structure of the basis, especially
for the basis of the projected sublattice $B_{[i,j]}$. In fact, $B_{[i,j]}$ can be expressed as the product of truncations
of $M$, $D$, and $O$ (rows and columns $i$ to $j$ of $M$ and $D$, and rows $i$ to $j$ of $O$),
whereas the volume of $\mathcal{L}(B)$ is $\prod_{i=1}^{n} \|b_i^*\|$.
Therefore, with the GSO decomposition $M$, $D$ and $O$, we also get the representation of all the $B_{[i,j]}$
simultaneously. More generally, suppose we have a set of vectors $B = \{b_1, \dots, b_l\} \subset \mathbb{R}^m$ which generates
a lattice $\mathcal{L}$ of rank $n$, $n \le l \le m$. We then have the following decomposition.




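$$P \cdot \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_l \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
\mu_{2,1} & 1 & \ddots & \vdots \\
\mu_{3,1} & \mu_{3,2} & \ddots & 0 \\
\vdots & & \ddots & \\
\mu_{n,1} & \cdots & \mu_{n,n-1} & 1 \\
\mu_{n+1,1} & \cdots & \cdots & \mu_{n+1,n} \\
\vdots & & & \vdots \\
\mu_{l,1} & \mu_{l,2} & \cdots & \mu_{l,n}
\end{pmatrix}
\begin{pmatrix}
\|b_1^*\| & 0 & \cdots & 0 \\
0 & \|b_2^*\| & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & \|b_n^*\|
\end{pmatrix}
\begin{pmatrix} b_1^* / \|b_1^*\| \\ b_2^* / \|b_2^*\| \\ \vdots \\ b_n^* / \|b_n^*\| \end{pmatrix}$$

where $P$ is an $l \times l$ permutation matrix. The first $n$ rows of $P \cdot B$ form a basis of $\mathcal{L}$. This gives an algorithm
that computes a basis from a generating set of $\mathcal{L}$.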

In addition, this also shows that it is easy to decide whether a vector $v$ belongs to a lattice $\mathcal{L}$. If $v$ is an
integer combination of $B$, which means that there exists an integer vector $x$ such that $v = xB$, then the
equation $v = xMDO$ must have an integer solution. Multiplying by $O^T \cdot D^{-1}$ on both sides, we get:

$$v\, O^T D^{-1} = xM.$$

Here, $O^T$ is the transpose of $O$, which satisfies $OO^T = I_{n,n}$, where $I_{n,n}$ is the identity matrix of dimension $n \times n$,
and the inverse of $D$ is given by $\mathrm{diag}(\|b_1^*\|^{-1}, \dots, \|b_n^*\|^{-1})$. Since $M$ is already a lower-triangular matrix,
solving this system is straightforward. This procedure also solves the following
problem: given a basis $B$ of rank $n$ and a point $v$ in the lattice $\mathcal{L}(B)$, determine the integer vector
$x = (x_1, \dots, x_n) \in \mathbb{Z}^n$ such that $v = xB$.


Algorithm 1 Determine membership and find the decomposition
Input: A basis B of a lattice, and a vector v
Output: Whether v ∈ L(B). If so, the vector x ∈ Z^n such that v = xB.
  Calculate the GSO of the basis: [M, D, O] ← decompose(B).
  Solve the linear system v O^T D^{-1} = xM.
  if v = xB and x ∈ Z^n then
    return (yes, x)
  else
    return no
  end if
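Algorithm 1 can be sketched in a few lines of NumPy. This is an illustrative floating-point version, not a reference implementation: the names `gso` and `lattice_membership` and the tolerance `tol` are assumptions introduced here, and exact arithmetic would be preferable for large integer bases.

```python
import numpy as np

def gso(B):
    """GSO decomposition B = M @ D @ O for a basis given by the rows of B."""
    B = np.asarray(B, dtype=float)
    n = B.shape[0]
    Bstar = B.copy()
    M = np.eye(n)
    for i in range(n):
        for j in range(i):
            M[i, j] = (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j])
            Bstar[i] -= M[i, j] * Bstar[j]
    norms = np.linalg.norm(Bstar, axis=1)
    return M, np.diag(norms), Bstar / norms[:, None]

def lattice_membership(B, v, tol=1e-9):
    """Return (True, x) with v = x @ B and x integral, or (False, None)."""
    B = np.asarray(B, dtype=float)
    v = np.asarray(v, dtype=float)
    M, D, O = gso(B)
    # Right-hand side of the triangular system: y = v O^T D^{-1} = x M.
    y = v @ O.T @ np.linalg.inv(D)
    # x M = y is equivalent to M^T x^T = y^T.
    x = np.linalg.solve(M.T, y)
    x_int = np.rint(x)
    # v must lie in span(B) (checked via x_int @ B == v) and x must be integral.
    if np.allclose(x, x_int, atol=tol) and np.allclose(x_int @ B, v, atol=tol):
        return True, x_int.astype(int)
    return False, None

# Hypothetical example basis in R^3.
B = np.array([[1, 2, 0],
              [0, 1, 3]])
print(lattice_membership(B, [2, 1, -9]))   # 2*b_1 - 3*b_2
print(lattice_membership(B, [0.5, 1, 0]))  # in span(B), but not a lattice point
```

The final check `x_int @ B == v` matters when $n < m$: a vector outside $\mathrm{span}(B)$ can still yield integer-looking coefficients after projection.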


2.2 Hard lattice problems 

One of the great advantages of lattice-based cryptography is its hardness guarantee. This hardness rests
on two facts. First of all, many lattice problems are known to be NP-hard, whereas for integer factoring
and discrete logarithms, we do not yet know their exact complexity class. Secondly, thanks to a worst-case
to average-case reduction, we are able to define a lattice problem over a class of random instances
whose hardness is guaranteed by the hardness of many lattice problems on their hardest instances. This
means that for many lattice problems, if an algorithm is able to solve a random instance with non-negligible
probability, it also solves the hardest instances with high probability.

Most of these hard problems are related to short vectors in the lattice. In particular, two problems,
the shortest vector problem and the closest vector problem, have a central position among all hard problems.


2.2.1 Shortest vectors 


The norm of a shortest nonzero vector in the lattice is also called $\lambda_1(\mathcal{L})$. This is the minimal distance. If
we consider lattices with fixed volume $\mathrm{vol}(\mathcal{L}) = 1$, then $\lambda_1(\mathcal{L})$ can be arbitrarily close to 0. For example,
in the lattice generated by the basis

$$\begin{pmatrix} \epsilon & 0 \\ 0 & 1/\epsilon \end{pmatrix},$$

there exists a vector of norm $\epsilon$, which can be arbitrarily
small. Conversely, it turns out that there is a maximum value for $\lambda_1(\mathcal{L})$ among all lattices of fixed
volume.


Definition 2.2.1 (Hermite's constant). The supremum of $\lambda_1(\mathcal{L})^2 / \mathrm{vol}(\mathcal{L})^{2/n}$ over all lattices $\mathcal{L}$ of dimension $n$ is
denoted by $\gamma_n$, and is called Hermite's constant of dimension $n$.

It can be proved that this supremum is always reached by one or several lattices. The exact values of
$\gamma_n$ are very difficult to find, and they are only known for $1 \le n \le 8$ and for $n = 24$. For these dimensions
the lattices reaching Hermite's constant are also known.

For general $n$, there are asymptotic bounds. For example, Minkowski gave an upper bound
on $\gamma_n$ which is linear in $n$:

$$\gamma_n < 1 + n/4 \qquad (2.3)$$

This is a consequence of the following theorem. 

Theorem 2.2.2 (Minkowski's Convex Body Theorem). Let $\mathcal{L}$ be a full-rank lattice of $\mathbb{R}^n$. Let $S$ be a measurable
subset of $\mathbb{R}^n$, convex, symmetric with respect to 0, such that $\mathrm{vol}(S) > 2^n \cdot \mathrm{vol}(\mathcal{L})$. Then $S$ contains at least one
non-zero point of $\mathcal{L}$.

Proof. The proof of this theorem uses Blichfeldt's lemma, which states that for a measurable set $S$ with
$\mathrm{vol}(S) > \mathrm{vol}(\mathcal{L})$, there are at least two different points $x, y \in S$ such that $x - y \in \mathcal{L}$ [66]. Now, consider
the lattice $2\mathcal{L} = \{2v : v \in \mathcal{L}\}$. It has volume $2^n \cdot \mathrm{vol}(\mathcal{L})$, thus $S$ contains two distinct points
$x, y$ such that $x - y \in 2\mathcal{L}$. By symmetry, $-y$ also belongs to $S$, and by convexity
$(x - y)/2$ belongs to $S$. Since $x - y \in 2\mathcal{L}$ and $x \neq y$, the point $(x - y)/2 \in \mathcal{L}$ is a non-zero lattice point. □


Definition 2.2.3 (n-ball). An $n$-ball of radius $r$ is the generalization of a closed ball to dimension $n$, defined as
$B_n(r) = \{x \in \mathbb{R}^n : \sum_{i=1}^{n} x_i^2 \le r^2\}$. Its volume is

$$v_n(r) = \frac{\pi^{n/2}}{\Gamma(n/2 + 1)} \cdot r^n.$$

Using Stirling's formula, one can bound this volume from above and below.
By default, we mean $r = 1$, which refers to a unit $n$-ball.


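Let $\mathcal{L}$ be an arbitrary lattice of dimension $n$. For any $n$-ball $B_n(r)$ such that $v_n(r) > 2^n \cdot \mathrm{vol}(\mathcal{L})$, according
to Minkowski's convex body theorem, $B_n(r)$ contains a non-zero vector of $\mathcal{L}$. This implies that $\lambda_1(\mathcal{L}) \le r$.
Since this holds for an arbitrary $\mathcal{L}$ of dimension $n$, we obtain

$$\gamma_n \le \frac{4}{v_n^{2/n}} < 1 + n/4,$$

where the last inequality follows from the Stirling bounds on $v_n$.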
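The linear bound (2.3) can be sanity-checked numerically. The sketch below assumes the reading $\gamma_n \le 4 / v_n^{2/n}$ of the Minkowski argument above, with $v_n$ the volume of the unit $n$-ball; the function name is illustrative, and `math.lgamma` is used to avoid overflow in large dimensions.

```python
import math

def minkowski_hermite_bound(n: int) -> float:
    """Return 4 / v_n^(2/n), where v_n is the volume of the unit n-ball."""
    log_vn = (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)
    return 4.0 * math.exp(-2.0 * log_vn / n)

# Compare the bound derived from the convex body theorem with 1 + n/4.
for n in (2, 8, 24, 100):
    print(n, minkowski_hermite_bound(n), 1 + n / 4)
```

For large $n$ the first quantity grows like $2n/(\pi e) \approx 0.234\,n$, slightly below the slope $1/4$ of the linear bound.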


The definition of $\lambda_1(\mathcal{L})$ is straightforward: it measures the minimal distance between two lattice
vectors. On the other hand, this gives information in one direction only, see fig. 2.2. If we want to
explore the next closest distances to other neighbors, we can similarly define $\lambda_2(\mathcal{L}), \dots$. But $\lambda_2(\mathcal{L})$ is
not simply defined as the norm of the second shortest vector in the lattice, because in some cases this would produce
vectors parallel to a shortest vector, which is the case illustrated in fig. 2.2. In general, we want the
vectors of norm at most $\lambda_d$ to span $\mathbb{R}^d$. This is captured by the following definition, initially given
by Minkowski.




Figure 2.2: A lattice in $\mathbb{R}^2$ which has a big $\lambda_2(\mathcal{L}) / \lambda_1(\mathcal{L})$ ratio



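Definition 2.2.4 (Successive Minima). Let $\mathcal{L}$ be a lattice of dimension $n$. For $i = 1, \dots, n$, the $i$-th successive
minimum is defined as

$$\lambda_i(\mathcal{L}) = \min\{ r : B_n(r) \cap \mathcal{L} \text{ contains at least } i \text{ linearly independent vectors} \}.$$

Clearly, the minima are increasing: $\lambda_1(\mathcal{L}) \le \lambda_2(\mathcal{L}) \le \cdots \le \lambda_n(\mathcal{L})$.

Similarly to the upper bound on $\lambda_1(\mathcal{L})$ given by Minkowski, there exists an upper bound on the
successive minima.

Theorem 2.2.5 (Minkowski's Second Theorem). Let $\mathcal{L}$ be a lattice of rank $n$. We have

$$\left( \prod_{i=1}^{n} \lambda_i(\mathcal{L}) \right)^{1/n} \le \sqrt{\gamma_n} \cdot \mathrm{vol}(\mathcal{L})^{1/n}. \qquad (2.4)$$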

The interested reader is referred to the book [93] for a proof of this theorem. One interesting
remark is that for a lattice $\mathcal{L}$ of rank $n$, although we can find a set of linearly independent vectors reaching
$\lambda_1(\mathcal{L}), \dots, \lambda_n(\mathcal{L})$ simultaneously, this set does not necessarily form a basis. In fact, as soon as
$n = 5$, there is a counter-example. Consider the lattice generated by the following basis:

$$\begin{pmatrix}
2 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 \\
1 & 1 & 1 & 1 & 1
\end{pmatrix}.$$

All of its minima are 2, but it has no basis reaching all these minima.
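This counter-example is small enough to verify by exhaustive search. The sketch below assumes the basis has rows $2e_1, \dots, 2e_4$ and $(1,1,1,1,1)$; the function name `short_vectors` is illustrative. For this particular basis, coefficients in $[-2, 2]$ suffice to reach every lattice vector of norm at most 2, since the last coefficient equals the last coordinate.

```python
import itertools
import numpy as np

# Assumed basis of the counter-example (row vectors).
B = np.array([[2, 0, 0, 0, 0],
              [0, 2, 0, 0, 0],
              [0, 0, 2, 0, 0],
              [0, 0, 0, 2, 0],
              [1, 1, 1, 1, 1]])

def short_vectors(B, bound_sq, coeff_range=2):
    """Non-zero vectors x @ B with squared norm <= bound_sq, x in [-coeff_range, coeff_range]^n."""
    n = B.shape[0]
    found = []
    for x in itertools.product(range(-coeff_range, coeff_range + 1), repeat=n):
        v = np.array(x) @ B
        if v.any() and int(v @ v) <= bound_sq:
            found.append(v)
    return np.array(found)

shortest = short_vectors(B, 4)
# All successive minima equal 2 if the vectors of norm 2 already span R^5
# and no shorter non-zero vector exists.
print(len(shortest), np.linalg.matrix_rank(shortest))

# The vectors of norm 2 are +-2e_i; any 5 independent ones generate 2Z^5.
vol_lattice = abs(round(np.linalg.det(B)))
vol_minima = abs(round(np.linalg.det(2 * np.eye(5))))
print(vol_lattice, vol_minima)
```

The sublattice generated by vectors reaching the minima has twice the volume of the lattice itself, so it is a proper sublattice and cannot be generated by a basis of the full lattice.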

2.2.2 Shortest vector problem 

In practical settings of cryptology, one often deals with integer lattices. Here is the formal definition of
SVP, which serves as the basis for many cryptographic primitives.

Definition 2.2.6 (Shortest Vector Problem, SVP). Given a basis $B$ of the lattice $\mathcal{L} \subseteq \mathbb{Z}^m$, find a nonzero vector
$v$ in $\mathcal{L}$ such that

$$\|v\| = \min_{u \in \mathcal{L}(B),\, u \neq 0} \|u\|.$$



This is also called the exact SVP, in comparison with several approximate versions of SVP.
The problem does not state precisely what the input basis should look like. Obviously, $B$ should not
contain short vectors that are already very close to the answer; for example, for full-rank lattices, bases
are usually given in Hermite normal form (HNF) in this problem.

The exact SVP is known to be NP-hard under randomized reductions [5]. In many cryptographic
applications, we are also interested in approximate versions of the problem. For example, we
can relax the norm of the target vector by a factor of $\gamma$.

Definition 2.2.7 (Approx-SVP$_\gamma$, ASVP$_\gamma$). Given a basis $B$ of the $n$-dimensional lattice $\mathcal{L} \subseteq \mathbb{Z}^m$ and a factor $\gamma$,
find a non-zero vector $v$ in $\mathcal{L}$ such that $\|v\| \le \gamma \cdot \lambda_1(\mathcal{L}(B))$.

In practice, however, we usually do not know the value of $\lambda_1(\mathcal{L})$ in advance, so it is sometimes
difficult even to judge whether a vector is a solution or not, although we have asymptotic bounds on $\lambda_1(\mathcal{L})$. On
the other hand, if the bound on the norm of the target vector only depends on the volume of the lattice
$\mathrm{vol}(\mathcal{L})$, it is much easier to decide whether a vector belongs to the solution set or not.

Definition 2.2.8 (Hermite-SVP$_\gamma$, HSVP$_\gamma$). Given a basis $B$ of the $n$-dimensional lattice $\mathcal{L} \subseteq \mathbb{Z}^m$ and a factor $\gamma$,
find a non-zero short vector $v$ in $\mathcal{L}$ such that $\|v\| \le \gamma \cdot \mathrm{vol}(\mathcal{L})^{1/n}$.

In many cryptographic applications, the lattices have a much shorter vector than the average $\lambda_1(\mathcal{L})$
of random lattices. This motivates the study of whether finding short vectors in such lattices is
much easier than the original SVP.

Definition 2.2.9 (Unique-SVP$_\gamma$, uSVP$_\gamma$). Given a basis $B$ of the $n$-dimensional lattice $\mathcal{L} \subseteq \mathbb{Z}^m$ such that
$\lambda_2(\mathcal{L}) > \gamma \lambda_1(\mathcal{L})$, find a shortest vector of $\mathcal{L}(B)$.

The proofs of the hardness results for these problems make use of the gap version of the lattice
problem [49]. In general, a gap version GapX$_\gamma$ of an optimization problem X is a promise problem where
the instance is guaranteed either to have a good optimum (the YES instances) or to be far from it (the NO
instances). The ratio between the optimum values in the YES and the NO cases is at least $\gamma$.

Definition 2.2.10 (GapSVP$_\gamma$). Given a basis $B$ of the $n$-dimensional lattice $\mathcal{L}$, a factor $\gamma$ and a real number $d > 0$, GapSVP$_\gamma$ is a
promise problem whose

- Yes instances are all lattices satisfying $\lambda_1(\mathcal{L}) < d$, and

- No instances are all lattices satisfying $\lambda_1(\mathcal{L}) > \gamma \cdot d$.

2.2.3 Closest vector problem 

Definition 2.2.11 (Closest Vector Problem, CVP). Given a basis $B$ of the lattice $\mathcal{L} \subseteq \mathbb{Z}^m$ of rank $n$, and a point
$x \in \mathbb{Q}^m$, find a lattice vector $v$ which is closest to $x$:

$$v = \arg\min_{w \in \mathcal{L}(B)} \|w - x\|.$$

Similar to the case of SVP, there is an approximate version of CVP:

Definition 2.2.12 (Approx-CVP$_\gamma$, ACVP$_\gamma$). Given a basis $B$ of an $n$-dimensional lattice $\mathcal{L} \subseteq \mathbb{Z}^m$, a factor
$\gamma$, and a point $x \in \mathbb{Q}^m$, find a vector $v \in \mathcal{L}$ such that $\|v - x\| \le \gamma \cdot \mathrm{dist}(\mathcal{L}, x)$.

The CVP is reminiscent of a problem in coding theory, which serves as the basis for the security of linear
codes.

Definition 2.2.13 (General decoding problem for linear codes). Let $C$ be an $(n, k)$ linear code over a finite
field $F$ and $y \in F^n$. Find $x \in C$ such that the distance between $x$ and $y$ is minimal. Here, an $(n, k)$ linear code is a
$k$-dimensional linear subspace of the vector space $F^n$.

For many cryptographic applications, the lattice, which usually serves as the public key, is fixed and
published for a relatively long period of time; only the point $x$, which is the ciphertext, varies as an input.
The adversary is allowed to do any sort of preprocessing and to store a polynomial amount of information
about the lattice, and the question is whether CVP is still hard to compute in this scenario.

Definition 2.2.14 (CVP$_\gamma$ with Preprocessing, CVPP$_\gamma$). Given a basis $B$ of an $n$-dimensional lattice $\mathcal{L} \subseteq \mathbb{Z}^m$
and a factor $\gamma$, one is allowed to do arbitrary preprocessing and store a polynomial amount of information. Given a
point $x \in \mathbb{Z}^m$, the computational problem consists of finding a vector $v \in \mathcal{L}$ such that $\|v - x\| \le \gamma \cdot \mathrm{dist}(\mathcal{L}, x)$.

Both these problems have been proved to be NP-hard in their exact form [8,26,98]. Many
cryptosystems relying on these hardness results have been proposed. For example, the Goldreich-Goldwasser-Halevi
(GGH) cryptosystem [37] is based on the hardness of CVP, and the McEliece cryptosystem is an application
of the linear decoding problem [61]. The GGH encryption challenges were solved in [67], while the
GGH signature was broken by Nguyen [70].

According to the work of Arora et al. [7], CVP is NP-hard to approximate within any constant factor, while Micciancio [62] has shown that SVP is hard to approximate within any constant factor less than $\sqrt{2}$. In fact, Goldreich et al. [39] gave a Turing reduction from SVP to CVP, showing that any hardness result for SVP implies the same hardness for CVP (but not vice versa). As Regev [3] pointed out, there exists $c > 0$ such that GapCVP$_{c\sqrt{n}}$ is in NP $\cap$ coNP; accordingly, GapSVP$_{c\sqrt{n}}$ is in NP $\cap$ coNP too. Hence these problems cannot be NP-hard to approximate within such factors under randomized reductions unless the polynomial hierarchy collapses [3,36,42]. The complexity of ASVP [27,80] is summarized graphically in figure 2.3.


Figure 2.3: ASVP$_\gamma$ hardness bar (some constants are omitted)

2.2.4 Worst-case to average-case reduction

In 1996, Ajtai established the first worst-case to average-case reduction for lattice problems [4]. This groundbreaking work established a connection between the worst-case and average-case complexity of certain lattice approximation problems.

One basic requirement in cryptographic settings is to generate instances of problems that are difficult to solve. For instance, problems which are proved to be NP-complete are unlikely to have polynomial-time solutions. Alternatively, if a problem has been famous for a long time and no efficient algorithm has been discovered in spite of tremendous efforts from the research community, this problem can be regarded as difficult as well. But in both cases, the difficulty of the problem usually refers to its worst-case instances. If one needs to generate a large number of instances, each of which is expected to be difficult, then an additional proof is needed. Namely, one needs a reduction from the worst case of a hard problem to the average case of a class of problems, from which one can conveniently generate instances suitable for cryptographic applications. Indeed, Ajtai proved that for a certain class of SIS problems, a polynomial-time algorithm which solves a random instance with non-negligible probability also yields an algorithm solving every instance of some famous problems which are considered difficult in the worst case.

More precisely, consider the following problems:

• Find $\lambda_1(\mathcal{L})$ approximately up to a polynomial factor;

• uSVP$_{n^c}$ for any sufficiently large constant $c$;

• Find a basis $b_1, \ldots, b_n$ such that $\max_{i=1}^{n} \|b_i\|$ is smallest possible up to a polynomial factor.

Ajtai proved that if any of these problems has no polynomial-time probabilistic algorithm succeeding with non-negligible probability, then a large class of SIS problems has no polynomial-time probabilistic solution either. Later on, Regev et al. [13,79] proved that a class of LWE problems also enjoys this worst-case to average-case reduction property. These two problems are presented in the following subsections.

2.2.5 Short integer solution 

The short integer solution (SIS) problem was first presented in the work of Ajtai [4], where the worst-case to average-case reduction is used to build a one-way function. It was later used in many other applications such as collision-resistant hash functions [38], digital signature schemes [18,35], and identification schemes [55].

Definition 2.2.15 (Short integer solution, SIS). Let $n$ and $q$ be integers, where $n$ is the primary security parameter, and let $\beta > 0$. Given a uniformly random matrix $A \in \mathbb{Z}_q^{n \times m}$, for some $m = \mathrm{poly}(n)$, the problem asks to find a nonzero integer vector $z \in \mathbb{Z}^m$ such that $Az = 0 \bmod q$ and $\|z\| < \beta$.

Although this problem is not stated in terms of lattices, it can be regarded as finding a nonzero vector shorter than $\beta$ in the lattice $\mathcal{L} = \{y \in \mathbb{Z}^m : Ay = 0 \bmod q\}$.
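To make the definition concrete, the following Python sketch checks the two SIS conditions and solves a toy instance by exhaustive search. The function names, the toy matrix and the search bound are illustrative assumptions that do not come from the text, and exhaustive search is only feasible for such tiny parameters.

```python
import itertools
import math

def is_sis_solution(A, z, q, beta):
    """Check that z is nonzero, A z = 0 (mod q) and ||z|| < beta."""
    if all(c == 0 for c in z):
        return False
    for row in A:
        if sum(a * c for a, c in zip(row, z)) % q != 0:
            return False
    return math.sqrt(sum(c * c for c in z)) < beta

def brute_force_sis(A, q, beta, bound=1):
    """Search z in {-bound, ..., bound}^m exhaustively (toy sizes only)."""
    m = len(A[0])
    for z in itertools.product(range(-bound, bound + 1), repeat=m):
        if is_sis_solution(A, z, q, beta):
            return list(z)
    return None

# Hypothetical toy instance: n = 2, m = 4, q = 5.
A = [[1, 2, 3, 4],
     [0, 1, 1, 3]]
z = brute_force_sis(A, q=5, beta=2.1)
```

Any vector returned by `brute_force_sis` is a short nonzero vector of the lattice $\{y : Ay = 0 \bmod q\}$, which is the lattice view given above.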

For any $q$ that is at least some polynomial in $n$, solving the SIS problem implies a solution to standard worst-case lattice problems [82]. The work by Micciancio and Peikert [63] showed hardness results for SIS with small parameters: let $\beta \ge \beta_\infty \ge 1$ be reals, let $Z = \{z \in \mathbb{Z}^m : \|z\| \le \beta \text{ and } \|z\|_\infty \le \beta_\infty\}$, and let $q \ge \beta \cdot n^{\delta}$ for some constant $\delta > 0$. Then solving SIS with parameters $n, m, q$ and solution set $Z \setminus \{0\}$ is at least as hard as approximating lattice problems in the worst case on $n$-dimensional lattices to within a factor $\gamma = \max\{1, \beta \cdot \beta_\infty / q\} \cdot \tilde{O}(\beta\sqrt{n})$.

2.2.6 Learning with errors

Learning with errors (LWE) is a very important class of problems, proposed by Regev [64,81]. It has extremely versatile cryptographic applications, such as identity-based encryption [2,35] and fully homomorphic encryption [11,12,14]. It can be viewed as a dual problem to the SIS problem. The definition of the search version of LWE is given as follows.

Definition 2.2.16 (Learning with errors, LWE). Given a modulus $q$, a rank $n$, and an "error" probability distribution $\chi$ which depends on a parameter $\alpha \in (0,1)$, let $A_{s,\chi}$ on $\mathbb{Z}_q^n \times \mathbb{Z}_q$ be defined as follows:

$$\begin{pmatrix} a_1, & a_1 \cdot s + e_1 \\ a_2, & a_2 \cdot s + e_2 \\ \vdots & \vdots \\ a_m, & a_m \cdot s + e_m \end{pmatrix}$$

where the $a_i \in \mathbb{Z}_q^n$ are vectors chosen uniformly at random, the $e_i \in \mathbb{Z}_q$ are chosen according to $\chi$, and additions are made in $\mathbb{Z}_q$. An algorithm solves LWE with modulus $q$ and distribution $\chi$ if, for any $s \in \mathbb{Z}_q^n$, given an arbitrary number of independent samples from $A_{s,\chi}$, it outputs $s$ with non-negligible probability.

The LWE problem can be informally described as recovering a secret $s$ given a sequence of "approximate" random linear equations on $s$. In the original LWE article [81], $\chi$ is the integral rounding of a continuous Gaussian distribution. That is, $\chi$ is the distribution of the random variable $\lfloor qX \rceil \bmod q$, where $X$ is a continuous centered normal variable with standard deviation $\alpha/\sqrt{2\pi}$, reduced modulo 1. The Lindner-Peikert article [53] uses instead the discrete Gaussian distribution $\chi = D_{\mathbb{Z}^m, \alpha q}$.
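As an illustration of the rounded-Gaussian variant, here is a minimal Python sketch that draws LWE samples $(a_i, a_i \cdot s + e_i \bmod q)$. The function names and the toy parameters are assumptions made for this example only; `random.gauss` stands in for the continuous normal variable $X$, and real parameter sizes would be much larger.

```python
import math
import random

def lwe_samples(s, q, alpha, m, rng):
    """Return m pairs (a_i, a_i . s + e_i mod q) with rounded-Gaussian errors."""
    n = len(s)
    sigma = alpha / math.sqrt(2 * math.pi)  # standard deviation of X
    samples = []
    for _ in range(m):
        a = [rng.randrange(q) for _ in range(n)]
        e = round(q * rng.gauss(0.0, sigma)) % q  # round(qX) mod q
        b = (sum(x * y for x, y in zip(a, s)) + e) % q
        samples.append((a, b))
    return samples

def centered(x, q):
    """Representative of x mod q lying in (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

# Hypothetical toy parameters.
rng = random.Random(0)
secret = [3, 14, 15]
samples = lwe_samples(secret, q=97, alpha=0.01, m=20, rng=rng)
```

With the secret known, `centered(b - <a, s>, q)` recovers each error term, which makes it easy to see that every sample is a slightly perturbed linear equation on the secret.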

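Definition 2.2.17 (Bounded distance decoding, BDD$_\gamma$). Given a lattice $\mathcal{L}$ and a target $v$ such that $\mathrm{dist}(\mathcal{L}, v) < \gamma\, \mathrm{vol}(\mathcal{L})^{1/n}$, BDD asks to find $u \in \mathcal{L}$ satisfying $\|u - v\| < \gamma\, \mathrm{vol}(\mathcal{L})^{1/n}$.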

The LWE problem can be regarded as a bounded distance decoding (BDD) problem on a certain family of lattices. For $A \in \mathbb{Z}_q^{n \times m}$, define the lattice

$\mathcal{L}(A^T) = \{z \in \mathbb{Z}^m : \exists s \in \mathbb{Z}^n, z = A^T s \bmod q\}.$

Then

$v = A^T s + e \bmod q$

is the target point, which is at distance $\|e\|$ from the lattice $\mathcal{L}(A^T)$.

The first hardness result for LWE [79] showed that the LWE problem with $\alpha q \ge 2\sqrt{n}$ is at least as hard as quantumly approximating some worst-case lattice problems on $n$-dimensional lattices to within $\tilde{O}(n/\alpha)$ factors. The first results on classical reductions were established by Peikert [74] and by Lyubashevsky and Micciancio [56], stating that LWE with exponential modulus is as hard as some standard lattice problems under a classical reduction. A recent work by Brakerski et al. [13] proved the hardness result for LWE with polynomial modulus.


2.3 Random lattices 

One of the most important applications of lattices in cryptology is to provide computationally hard problems that can be used to create trapdoor functions. However, when we say that a lattice-related problem is hard, it is important to specify which category of lattices is considered and what basis is given. In fact, even for problems which are considered difficult, there exist special lattices for which they are easy to solve.

In the most general definition, we allow $\mathcal{L} \subset \mathbb{R}^m$, which means that the coordinates of vectors can take any real value. However, computers cannot represent irrational numbers exactly in general. There are usually two solutions. Either we use floating-point numbers for approximation, in which case we have to choose a working precision carefully and be aware of rounding errors; or we restrict the basis vectors to be rational, in which case we can multiply the basis matrix by a suitable constant to make it an integer matrix on which exact computations can be performed.

In this section, we restrict our discussion to full-rank lattices. We can do this because any $\mathcal{L} \subset \mathbb{R}^m$ of rank $n$ is isometric to a full-rank lattice in $\mathbb{R}^n$, up to a suitable rotation.

2.3.1 Random lattices in $\mathbb{R}^n$

Up to scaling, all lattices can be normalized to have volume 1. Thus, let $X_n$ be the set of full-rank lattices of rank $n$ modulo scaling:

$X_n = \{\mathcal{L} \subset \mathbb{R}^n : \mathrm{vol}(\mathcal{L}) = 1\}$

Lemma 2.3.1. $X_n$ is isomorphic to $SL_n(\mathbb{R})/SL_n(\mathbb{Z})$.

Proof. Let us denote by $\mathcal{L}_0 = \mathcal{L}((e_1, \ldots, e_n)^T)$ the lattice in $X_n$ generated by the canonical basis of $\mathbb{R}^n$. By the definition of $X_n$, we have a transitive group action of $SL_n(\mathbb{R})$ on $X_n$, i.e. the map

$SL_n(\mathbb{R}) \to X_n$

$T \mapsto T \cdot \mathcal{L}_0 \; (:= \mathcal{L}(T \cdot (e_1, \ldots, e_n)^T))$

is surjective. Hence, by the classical result in the theory of group actions on topological spaces, we know that

$X_n \cong SL_n(\mathbb{R})/\mathrm{Stabilizer}(\mathcal{L}_0).$

It suffices to show that $\mathrm{Stabilizer}(\mathcal{L}_0) = SL_n(\mathbb{Z})$. As was proved in lemma 2.1.8, $GL_n(\mathbb{Z}) = \{T : \mathcal{L}(TB) = \mathcal{L}(B)\}$, which means that $GL_n(\mathbb{Z})$ is the set of all matrices which transform one basis into another basis of the same lattice. Thus the set $SL_n(\mathbb{Z}) = SL_n(\mathbb{R}) \cap GL_n(\mathbb{Z})$ is the stabilizer of the group action. This completes the proof. □

In particular, $X_n$ is a homogeneous space of $SL_n(\mathbb{R})$. Hence on $X_n$ there is an invariant measure which is unique up to scaling, known as the Haar measure. In fact, the Haar measure is well defined for any locally compact group. Since $SL_n(\mathbb{Z})$ is a closed subgroup, the quotient $SL_n(\mathbb{R})/SL_n(\mathbb{Z})$ also carries an invariant measure, which is again unique up to a scalar. It is proved that this measure on the quotient is finite and can be normalized into a natural probability measure, called the Haar probability measure [99], which we denote by $\mu$. With this probability measure, we can define what a random lattice is: in theory, a random lattice in $X_n$ is a sample from $X_n$ according to the measure $\mu$.

2.3.2 Random integer lattices 

For integer lattices, the set of all lattices with a fixed volume $V$, $X_n(V) = \{\mathcal{L} \subset \mathbb{Z}^n : \mathrm{vol}(\mathcal{L}) = V\}$, is finite, thus the notion of randomness is straightforward: it consists of choosing one lattice uniformly from the whole set $X_n(V)$.

Meanwhile, for an integer lattice $\mathcal{L}$ of volume $V$ and dimension $n$, when we normalize its basis vectors by a factor of $1/V^{1/n}$, the lattice has unit volume, and when we take increasing values of $V$, we would hope that the behavior of uniformly random integer lattices approaches that of uniformly random lattices in $X_n$. Indeed, this is given by the following theorem proved in [40]:

Theorem 2.3.2. $X_n(V)$ is uniformly distributed in $X_n$ in the following sense: given any measurable subset $A \subset X_n$ whose boundary has zero measure with respect to $\mu$, the fraction of lattices in $X_n(V)$ that lie in $A$ tends to $\mu(A)$ as $V$ tends to infinity.

This theorem states that, when we increase the value of $V$, the uniform distribution over $X_n(V)$ approaches the uniform distribution over the set $X_n$ of all lattices modulo scaling. This is similar to reducing quantization error in signal processing: we increase the number of digits in our representation to approach the real value.

However, in many applications, we need to be able to generate a random instance of a lattice with a given rank $n$. Although the size of $X_n(V)$ is finite for a fixed $V \in \mathbb{Z}$, it is not clear how to sample efficiently from $X_n(V)$.

In order to describe the elements of $X_n(V)$, we represent each lattice by a basis in Hermite normal form. In fact, we shall see that an integer lattice has exactly one basis in Hermite normal form.

Definition 2.3.3 (Hermite Normal Form, HNF). The basis matrix $B$ of a full-rank integer lattice $\mathcal{L}$ is in Hermite Normal Form if

1. $B$ is lower triangular,

2. $\forall\, 1 \le i \le n$, $b_{i,i} > 0$, and

3. $\forall\, 1 \le i < j \le n$, $0 \le b_{j,i} < b_{i,i}$.

Theorem 2.3.4. For a full-rank integer lattice $\mathcal{L}$, there exists a unique basis $B$ in Hermite Normal Form.


Proof. To show the existence of such a basis we give a constructive proof. Given any integer basis $B$, we first show that we can find an equivalent basis that is lower triangular, with positive elements on its diagonal. Then, from this basis, we can find a basis in HNF.

To create the zeros in the upper triangular part, we proceed from the last column to the 2nd column, and for each column, we eliminate the non-zero elements above the diagonal one by one. Suppose at some point we arrive at the following situation:

$$\begin{pmatrix}
* & \cdots & * & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
b_{j,1} & \cdots & b_{j,i} & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
b_{i,1} & \cdots & b_{i,i} & 0 & \cdots & 0 \\
\vdots & & \vdots & & \ddots & \vdots \\
b_{n,1} & \cdots & b_{n,i} & \cdots & \cdots & b_{n,n}
\end{pmatrix}$$

where $j < i$ and both $b_{j,i}$ and $b_{i,i}$ are non-zero. We will perform substitution and swap operations to eliminate the non-zero element $b_{j,i}$. Notice that there must be some non-zero element among the first $i$ entries of the $i$-th column, otherwise the matrix is degenerate and the lattice is not full-rank. Moreover, replacing $b_j$ by $-b_j$ or by $b_j + x b_i$ with $x \in \mathbb{Z}$ does not change the represented lattice. Therefore we use the following algorithm:

Algorithm 2 Compute the Hermite normal form

Input: A basis $B$ of a full-rank integer lattice.

Output: The HNF of this lattice.

1: for $i = n, n-1, \ldots, 1$ do

2: if $b_{i,i} = 0$, find $k < i$ such that $b_{k,i} \ne 0$ and swap($b_i$, $b_k$).

3: $\forall\, 1 \le j \le i$, if $b_{j,i} < 0$ then $b_j \leftarrow -b_j$.

4: for $j = 1, \ldots, i-1$ do

5: while $b_{j,i} > 0$ do

6: if $b_{j,i} < b_{i,i}$, swap($b_i$, $b_j$).

7: $b_j \leftarrow b_j - \lfloor b_{j,i}/b_{i,i} \rfloor b_i$, where $\lfloor x \rfloor$ means the biggest integer in $(-\infty, x]$.

8: end while

9: end for

10: for $j = i+1, \ldots, n$ do

11: $b_j \leftarrow b_j - \lfloor b_{j,i}/b_{i,i} \rfloor b_i$.

12: end for

13: end for


One might notice that the while loop in lines 5-8 is very similar to the Euclidean algorithm for calculating the greatest common divisor, so it terminates. In addition, each time a while loop finishes, one more zero entry is created in the matrix, therefore the while loop is entered at most $O(n^2)$ times.

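For the proof of uniqueness, suppose on the contrary that there exist two distinct bases $B$ and $B'$ in HNF for the same lattice $\mathcal{L}$. Let $i$ be the smallest row index such that $b_i \ne b'_i$, and let $j$ be the biggest column index such that $b_{i,j} \ne b'_{i,j}$. As $b_i$ and $b'_i$ belong to $\mathcal{L}$, so does $b = b_i - b'_i$, whose last non-zero component is its $j$-th component $d = b_{i,j} - b'_{i,j}$. Because of the lower-triangular structure of the basis matrices, $b$ is an integer combination of the first $j$ vectors of $B$, and also of the first $j$ vectors of $B'$, so $d$ is a non-zero multiple of both $b_{j,j}$ and $b'_{j,j}$. If $j < i$, then $b_j = b'_j$ by the choice of $i$, and both $b_{i,j}$ and $b'_{i,j}$ lie in $[0, b_{j,j})$, hence $|d| < b_{j,j}$. If $j = i$, then $|d| < \max(b_{i,i}, b'_{i,i})$ since both diagonal entries are positive. In either case $|d|$ is too small to be a non-zero multiple of the corresponding diagonal entries, which violates the property of a basis in HNF and is therefore absurd. □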
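The construction in the existence proof can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the author's implementation: basis vectors are the rows of a square integer matrix, the name `hermite_normal_form` is hypothetical, and the sketch processes every column down to the first one so that the first diagonal entry is also made positive and the entries below it are reduced.

```python
def hermite_normal_form(B):
    """Lower-triangular HNF of a full-rank square integer basis (rows = vectors)."""
    B = [list(row) for row in B]
    n = len(B)
    for i in range(n - 1, -1, -1):
        # Make the pivot non-zero by swapping with an earlier row if needed.
        if B[i][i] == 0:
            k = next(k for k in range(i) if B[k][i] != 0)
            B[i], B[k] = B[k], B[i]
        # Make column i non-negative on rows 0..i.
        for j in range(i + 1):
            if B[j][i] < 0:
                B[j] = [-x for x in B[j]]
        # Euclid-like elimination of the entries above the pivot.
        for j in range(i):
            while B[j][i] > 0:
                if B[j][i] < B[i][i]:
                    B[i], B[j] = B[j], B[i]
                t = B[j][i] // B[i][i]
                B[j] = [x - t * y for x, y in zip(B[j], B[i])]
        # Reduce the entries below the pivot modulo the pivot.
        for j in range(i + 1, n):
            t = B[j][i] // B[i][i]
            B[j] = [x - t * y for x, y in zip(B[j], B[i])]
    return B
```

Two bases of the same lattice, for example `[[2, 1], [1, 3]]` and `[[1, 3], [3, 4]]`, are mapped to the same matrix, which illustrates the uniqueness statement of the theorem.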


Remark. The one-to-one correspondence between integer lattices and bases in HNF facilitates the sampling of random integer lattices. In fact, it suffices to enumerate all possible bases in HNF to list all integer lattices. Besides, when a basis is in HNF, it is easy to see that the volume of the lattice is the product of all the diagonal elements.

The situation is even simpler when $V$ is a prime $p$. In this case, the HNF basis of an element of $X_n(p)$ has exactly one diagonal element equal to $p$ and all others equal to 1, so it takes one of $n$ possible forms according to the position of $p$ on the diagonal. We say a basis $B$ such that $\mathcal{L}(B) \in X_n(p)$ is of category $i$ if its $i$-th diagonal element is $p$. Category 1 has $p^{n-1}$ elements, while the remaining categories together have $\sum_{l=0}^{n-2} p^l$ elements. When $p$ is large enough, the other categories form a negligible minority compared to category 1. Indeed, the following theorem, again from [40], confirms that the category 1 lattices alone become uniformly distributed as $p$ grows to infinity.

Theorem 2.3.5. Let $p$ be a prime number, and let $x_1, \ldots, x_{n-1}$ be numbers randomly and independently chosen from $\{1, \ldots, p\}$. Define $Y_n(p)$ as the set of all lattices whose bases are of the following form:

$$\begin{pmatrix} p & 0 & \cdots & 0 \\ x_1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ x_{n-1} & 0 & \cdots & 1 \end{pmatrix}$$

Then the uniform distribution over $Y_n(p)$ is statistically close to the uniform distribution over $X_n(p)$.

Remark. This theorem directly yields a simple alternative algorithm for sampling integer lattices with a randomness guarantee, although it is also possible to generate the uniform distribution over integer lattices of given volume using a different process. It suffices to generate a big prime number $p$ along with $n-1$ random integers in the interval $[1, p-1]$. However, it is not clear whether the theorem still holds if we remove the condition that the volume is a prime $p$.
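The counting argument and the sampling procedure can be illustrated with a short Python sketch. The function names are hypothetical, the prime is supplied by the caller (its primality is not checked), and the first-column entries are drawn from $\{0, \ldots, p-1\}$, which is an assumption chosen to match the HNF convention.

```python
import itertools
import random

def hnf_category_counts(n, p):
    """Count HNF bases of volume p by category (position of p on the diagonal)."""
    counts = {}
    for c in range(1, n + 1):
        # Only the n - c entries below the pivot p are free, each in [0, p).
        counts[c] = sum(1 for _ in itertools.product(range(p), repeat=n - c))
    return counts

def sample_category_one(n, p, rng):
    """Return a basis with first column (p, x_1, ..., x_{n-1}) and unit diagonal elsewhere."""
    B = [[int(r == c) for c in range(n)] for r in range(n)]
    B[0][0] = p
    for r in range(1, n):
        B[r][0] = rng.randrange(p)  # assumption: residues taken in {0, ..., p-1}
    return B

counts = hnf_category_counts(3, 5)
basis = sample_category_one(4, 1009, random.Random(1))
```

For $n = 3$ and $p = 5$, category 1 contains $25$ bases against $6$ for all other categories together, and this imbalance grows quickly with $p$.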

2.4 Lattice reduction 

Among all the bases of a lattice, some are superior to others because they better reveal the structure of the lattice. Informally, lattice reduction is the process of searching for such "superior" bases. In practice, lattice reduction plays a core role in many lattice problems. The efficiency of reduction algorithms is a key element in evaluating the security of many lattice-based cryptosystems.

Unlike a vector space, a lattice usually does not have an orthogonal basis. However, we can attempt to find a basis that is close to being orthogonal. Since the lattice volume, i.e. the volume of the parallelepiped spanned by a lattice basis, is fixed, better orthogonality also means shorter basis vectors.

From the introduction of successive minima, we know that even the vectors of the shortest possible basis can exceed the successive minima. In theory, the shortest basis can be defined as the first element among all bases when sorting $(\|b_1\|, \ldots, \|b_n\|)$ in lexicographic order.

From an algorithmic point of view, the notion of Hermite-Korkine-Zolotarev (HKZ) reduction is more interesting. The HKZ condition is more relaxed than that of the shortest basis, and still, an HKZ-reduced basis approximates the successive minima. Besides, it has local properties which allow inductive analysis. Moreover, many weaker notions and algorithms stem from HKZ reduction, which makes it very important.

2.4.1 Hermite-Korkine-Zolotarev reduction 

Size-reduction is a much weaker reduction notion introduced by Hermite. Intuitively speaking, in a size-reduced basis, each pair of basis vectors $b_i$ and $b_j$ with $j < i$ satisfies $\|b_i\| = \min_{x \in \mathbb{Z}} \|b_i + x b_j\|$. In other words, $b_i$ cannot be made shorter by simply subtracting a multiple of $b_j$.

Definition 2.4.1 (Size-reduced). A basis $B = \{b_1, \ldots, b_n\}$ of a lattice is size-reduced if its Gram-Schmidt orthogonalization satisfies, for all $1 \le j < i \le n$,

$|\mu_{i,j}| \le \frac{1}{2}.$

The following is a description of the size-reduction algorithm.

Algorithm 3 Size reduction algorithm

Input: A basis $B = \{b_1, \ldots, b_n\}$ of the lattice $\mathcal{L}$.

Output: A size-reduced basis $\{b_1, \ldots, b_n\}$

1: Calculate the GSO of the basis $[M, D, O] \leftarrow$ decompose($B$).

2: for $i = 2$ to $n$ do

3: for $j = i-1$ downto 1 do

4: $b_i \leftarrow b_i - \lfloor \mu_{i,j} \rceil b_j$

5: for $k = 1$ to $j$ do

6: $\mu_{i,k} \leftarrow \mu_{i,k} - \lfloor \mu_{i,j} \rceil \mu_{j,k}$

7: end for

8: end for

9: end for
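Algorithm 3 can be sketched in Python with exact rational arithmetic. This is an illustrative sketch under assumptions: basis vectors are the rows of an integer matrix, the helper names `gram_schmidt` and `size_reduce` are hypothetical, and the nearest-integer rounding $\lfloor \cdot \rceil$ is realized with Python's `round`, which resolves exact ties to the even integer (either choice keeps $|\mu_{i,j}| \le 1/2$).

```python
from fractions import Fraction

def gram_schmidt(B):
    """Return (mu, norms2): GSO coefficients and squared norms ||b_i*||^2."""
    n = len(B)
    Bstar, norms2 = [], []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in B[i]]
        for j in range(i):
            mu[i][j] = sum(Fraction(a) * b for a, b in zip(B[i], Bstar[j])) / norms2[j]
            v = [a - mu[i][j] * b for a, b in zip(v, Bstar[j])]
        mu[i][i] = Fraction(1)
        Bstar.append(v)
        norms2.append(sum(a * a for a in v))
    return mu, norms2

def size_reduce(B):
    """Return a size-reduced basis of the same lattice (rows = vectors)."""
    B = [list(row) for row in B]
    n = len(B)
    mu, _ = gram_schmidt(B)
    for i in range(1, n):
        for j in range(i - 1, -1, -1):
            r = round(mu[i][j])  # nearest integer to mu_{i,j}
            if r:
                B[i] = [a - r * b for a, b in zip(B[i], B[j])]
                for k in range(j + 1):
                    mu[i][k] -= r * mu[j][k]
    return B
```

Since only integer multiples of earlier vectors are subtracted, the Gram-Schmidt vectors $b_i^*$ are unchanged and the output generates the same lattice as the input.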


The size reduction algorithm runs in polynomial time with respect to the lattice rank $n$. Computationally, size-reduction of a lattice basis is analogous to Gram-Schmidt orthogonalization of a basis in $\mathbb{R}^m$. Indeed, if we were working in a vector space and replaced the 4th line by $b_i \leftarrow b_i - \sum_{j=1}^{i-1} \mu_{i,j} b_j$, we would obtain an orthogonal basis. In the size-reduction algorithm, we round $\mu_{i,j}$ so that $b_i$ remains an integer combination of the basis vectors and $b_i \in \mathcal{L}$. This ensures that, for all $j < i$, the projection of $b_i$ onto the line in the direction of $b_j^*$ has norm at most $\frac{1}{2}\|b_j^*\|$, and putting it all together, we have the following inequality:

$\|b_i\|^2 \le \|b_i^*\|^2 + \frac{1}{4}\sum_{j=1}^{i-1} \|b_j^*\|^2.$



The HKZ reduction notion is related to shortest vectors. Starting from $i = 1$, we can inductively choose $b_i$ such that $b_i^*$ is a shortest vector of $\pi_i(\mathcal{L})$.


Definition 2.4.2 (HKZ-reduced). A basis $B = \{b_1, \ldots, b_n\}$ of a lattice $\mathcal{L}$ is HKZ-reduced if it is size-reduced and such that for all $1 \le i \le n$, $\|b_i^*\| = \lambda_1(\pi_i(\mathcal{L}))$.


An HKZ-reduced basis can be computed by the following algorithm, which makes at most $n$ calls to an SVP algorithm.


Algorithm 4 HKZ reduction algorithm

Input: A basis $B = \{b_1, \ldots, b_n\}$ of the lattice $\mathcal{L}$, a factor $\varepsilon$

Output: An HKZ-reduced basis $\{b_1, \ldots, b_n\}$

1: for $i = 1$ to $n-1$ do

2: $b' \leftarrow \mathrm{SVP}(\pi_i(\mathcal{L}(B)))$

3: Compute $x$ such that $b' = x \cdot \pi_i(B)$ using algorithm 1.

4: $c \leftarrow x \cdot (b_i, \ldots, b_n)$.

5: $B \leftarrow$ Find proper $(b_{i+1}, \ldots, b_n)$ to complete $(b_1, \ldots, b_{i-1}, c, b_{i+1}, \ldots, b_n)$ into a basis of $\mathcal{L}$.

6: $B \leftarrow$ Size-reduce($B$)

7: end for


Line 5 is feasible because of the following lemma.

Lemma 2.4.3. Let $v$ be a shortest vector of the lattice $\mathcal{L}(B)$, then $\mathcal{L}(\{v\})$ is a primitive sublattice of $\mathcal{L}(B)$.

Proof. $\{v\}$ is a basis for the lattice $\mathcal{L}(B) \cap \mathrm{span}(v)$. □

In the algorithm, $\pi_i(c)$ is a shortest vector of the lattice $\pi_i(\mathcal{L}(B))$, and therefore it can be completed into a basis of $\pi_i(\mathcal{L}(B))$ of the form $(\pi_i(c), \pi_i(b_{i+1}), \ldots, \pi_i(b_n))$; then $\{b_1, \ldots, b_{i-1}, c, b_{i+1}, \ldots, b_n\}$ is a basis of the lattice $\mathcal{L}(B)$.

The lengths of HKZ-reduced basis vectors are good approximations to the successive minima. Obtaining an HKZ-reduced basis is no easier than solving SVP, thus an HKZ-reduced basis is unlikely to be obtained by polynomial-time algorithms.

2.4.2 LLL reduction 

The first algorithm solving ASVP within a bounded approximation factor in polynomial time is the LLL algorithm. Its idea generalizes the 2-dimensional case.

Definition 2.4.4 (Lagrange-reduced). Let $\mathcal{L}$ be a rank-two lattice of $\mathbb{R}^n$. A basis $(b_1, b_2)$ of $\mathcal{L}$ is said to be Lagrange-reduced if and only if $b_1$ and $b_2$ satisfy the following two conditions:

1. $\|b_1\| \le \|b_2\|$, and

2. $|\langle b_1, b_2 \rangle| \le \|b_1\|^2/2$.

The geometrical representation of a Lagrange-reduced pair of basis vectors is given in figure 2.4, where the possible values of $b_2$ with regard to $b_1$ form the blue region. It is not hard to deduce that the pair of vectors satisfies the following relationship:

$\frac{\sqrt{3}}{2}\|b_1\| \le \|b_2^*\| \le \|b_2\|$

Hence we have an upper bound for $\|b_1\|$:

$\|b_1\| \le \left(\frac{4}{3}\right)^{1/4} \mathrm{vol}(\mathcal{L}(\{b_1, b_2\}))^{1/2}$

In fact, this is the only natural reduction notion in dimension 2, in the sense that both $b_1$ and $b_2$ are already optimal:

Theorem 2.4.5. Let $(b_1, b_2)$ be a basis of a two-dimensional lattice $\mathcal{L}$ of $\mathbb{R}^m$. The basis $(b_1, b_2)$ is Lagrange-reduced if and only if $\|b_1\| = \lambda_1(\mathcal{L})$ and $\|b_2\| = \lambda_2(\mathcal{L})$.

Proof. It is easy to see that a basis with $\|b_1\| = \lambda_1(\mathcal{L})$ and $\|b_2\| = \lambda_2(\mathcal{L})$ is Lagrange-reduced, so it suffices to prove the converse. In a Lagrange-reduced basis, we can write $b_2 = r b_1 + b_2^*$ where $|r| \le 1/2$ and $\|b_2^*\|^2 \ge (1 - r^2)\|b_1\|^2$. For any vector $v \in \mathcal{L}$, there exist $x, y \in \mathbb{Z}$ such that $v = x b_1 + y b_2$, and its norm is

$\|v\| = \sqrt{(x + ry)^2 \|b_1\|^2 + y^2 \|b_2^*\|^2} \ge \sqrt{x^2 + 2rxy + y^2} \cdot \|b_1\|$

Since $|2rxy| \le |xy| \le \max(x^2, y^2)$, we have $\|v\| \ge \min(|x|, |y|) \cdot \|b_1\|$, so $\|v\| \ge \|b_1\|$ whenever $xy \ne 0$. When $xy = 0$, $v$ is a non-zero multiple of $b_1$ or of $b_2$, and again $\|v\| \ge \|b_1\|$. Thus no non-zero $v$ can be shorter than $b_1$.

Because $r^2\|b_1\|^2 \le \frac{1}{4}\|b_1\|^2 \le \frac{1}{4}\|b_2\|^2$, we have

$\|b_2^*\|^2 = \|b_2\|^2 - r^2\|b_1\|^2 \ge \frac{3}{4}\|b_2\|^2$

For any vector $v$ that is linearly independent of $b_1$, $\|v\|^2 \ge y^2\|b_2^*\|^2 \ge \frac{3}{4}y^2\|b_2\|^2$. Thus if $|y| > 1$, then $\|v\| > \|b_2\|$. With $|y| = 1$, $\|v\|$ is minimized with $x = 0$, which means that $b_2$ is the shortest among all vectors that are linearly independent of $b_1$. □


Figure 2.4: A Lagrange-reduced basis 



Given a basis $(b_1, b_2)$ of a lattice $\mathcal{L}$, a Lagrange-reduced basis can be obtained by the following algorithm, which is attributed to Lagrange:


Algorithm 5 Lagrange reduction algorithm

Input: A basis $(b_1, b_2)$ of a 2-dimensional lattice $\mathcal{L}$.
Output: A Lagrange-reduced basis of $\mathcal{L}$.

1: size reduce $(b_1, b_2)$.

2: while $\|b_1\| > \|b_2\|$ do

3: swap $b_1$ and $b_2$.

4: size reduce $(b_1, b_2)$.

5: end while
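Algorithm 5 translates almost line by line into Python. The sketch below assumes integer input vectors that form a basis; the function name `lagrange_reduce` is hypothetical, and "size reduce" is taken to mean subtracting from $b_2$ the nearest-integer multiple of $b_1$.

```python
from fractions import Fraction

def lagrange_reduce(b1, b2):
    """Return a Lagrange-reduced basis of the lattice spanned by b1 and b2."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def size_reduce(b1, b2):
        # Subtract from b2 the nearest-integer multiple of b1.
        r = round(Fraction(dot(b1, b2), dot(b1, b1)))
        return [y - r * x for x, y in zip(b1, b2)]

    b1, b2 = list(b1), list(b2)
    b2 = size_reduce(b1, b2)
    while dot(b1, b1) > dot(b2, b2):
        b1, b2 = b2, b1
        b2 = size_reduce(b1, b2)
    return b1, b2

# The vectors (5, 8) and (8, 13) span the whole of Z^2.
u, v = lagrange_reduce([5, 8], [8, 13])
```

The loop terminates because each swap strictly decreases $\|b_1\|$, much like the remainders in the Euclidean algorithm.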
There are many ways to extend the Lagrange reduction notion to arbitrary dimensions. Among them, LLL reduction, proposed in 1982 by Lenstra et al. [51], is the first one which can be achieved in polynomial time.

Definition 2.4.6 (LLL-reduced). A basis $B$ is LLL-reduced with factors $(\eta, \delta)$ such that $\frac{1}{4} < \delta \le 1$ and $\frac{1}{2} \le \eta < \sqrt{\delta}$, if it is size-reduced in a relaxed sense,

$|\mu_{i,j}| \le \eta, \quad \forall j < i$ (2.6)

and its Gram-Schmidt orthogonalization satisfies the Lovász condition,

$\|b_{i+1}^* + \mu_{i+1,i} b_i^*\|^2 \ge \delta \|b_i^*\|^2$ (2.7)


for $1 \le i < n$.

In this definition, if the factor $\delta$ tends to 1, then condition 2.7 amounts to condition 1 in Lagrange reduction. But $\delta$ is usually set to be smaller than 1 to guarantee polynomial running time. In the original article, this factor is fixed to $\frac{3}{4}$ for simplicity of presentation. When the factor $\eta$ tends to 0.5, condition 2.6 becomes exactly the condition for size reduction. In many software packages such as NTL and Magma, the default value of $(\eta, \delta)$ is set close to $(\frac{1}{2} + \varepsilon, 1 - \varepsilon)$, which improves stability and yields a shorter $\|b_1\|$ at the cost of a longer running time [91]. When $(\eta, \delta)$ is not specified, we follow this convention. The LLL reduction algorithm [51], which is described in algorithm 7 in the next chapter, ensures that the output basis is LLL-reduced upon termination.

For an LLL-reduced basis, neighboring basis vectors satisfy

$\|b_i^*\| \le (\delta - \eta^2)^{-1/2} \|b_{i+1}^*\|, \quad \forall\, 1 \le i < n$ (2.8)

Altogether, this amounts to the following bound on $\|b_1\|$:

$\|b_1\| \le (\delta - \eta^2)^{-(n-1)/4} \cdot \mathrm{vol}(\mathcal{L})^{1/n}$ (2.9)

And the basis vectors are also approximations to $\lambda_i(\mathcal{L})$ within a bounded factor:

$\|b_i\| \le (\delta - \eta^2)^{-(n-1)/2} \cdot \lambda_i(\mathcal{L})$ (2.10)

which implies that $b_1$ provides a solution to SVP$_{(\delta-\eta^2)^{-(n-1)/2}}$. For $(\eta, \delta)$ close to $(0.5, 0.999)$, the factor in (2.9) is upper bounded by $1.075^{n-1}$. In practice, on a random lattice, the average factor is smaller than this upper bound. According to Gama and Nguyen's experiments [28] on numerous random lattice bases, on average we have $\|b_1\| \approx 1.021^n \, \mathrm{vol}(\mathcal{L})^{1/n}$ in high dimension.
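For reference, the bound (2.9) follows from (2.8) in a few lines. The derivation below is a sketch that reads (2.8) as $\|b_i^*\| \le (\delta - \eta^2)^{-1/2}\|b_{i+1}^*\|$ and uses $b_1^* = b_1$; the last line evaluates the constant for $(\eta, \delta) = (0.5, 0.999)$.

```latex
\begin{align*}
\|b_i^*\| &\ge (\delta-\eta^2)^{(i-1)/2}\,\|b_1\|
  && \text{(iterating (2.8) from index 1 to } i\text{)}\\
\mathrm{vol}(\mathcal{L}) = \prod_{i=1}^{n}\|b_i^*\|
  &\ge \|b_1\|^{n}\,(\delta-\eta^2)^{\sum_{i=1}^{n}(i-1)/2}
  = \|b_1\|^{n}\,(\delta-\eta^2)^{n(n-1)/4}\\
\|b_1\| &\le (\delta-\eta^2)^{-(n-1)/4}\,\mathrm{vol}(\mathcal{L})^{1/n}\\
(\delta-\eta^2)^{-1/4} &= (0.749)^{-1/4} \approx 1.075
  && \text{for } (\eta,\delta) = (0.5,\,0.999)
\end{align*}
```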

For an input lattice $\mathcal{L} \subset \mathbb{Z}^m$ of dimension $n$, whose entries are bounded in absolute value by $B \in \mathbb{Z}$, the original LLL algorithm runs in time $O(n^5 m \log^3 B)$. This complexity was later improved to $O(n^4 m \log B\,(n + \log B))$. Its wide range of applications makes the LLL reduction algorithm a popular field of study, and its algorithmic details are presented in section 2.5.

2.4.3 BKZ reduction 

Block Korkine-Zolotarev (BKZ) reduction with a block size parameter $2 \le \beta \le n$ is stronger than LLL reduction. Upon termination, the output basis of the BKZ algorithm is guaranteed to be BKZ-reduced in the following sense.

Definition 2.4.7 (BKZ-reduced). A basis $B$ is BKZ-reduced with block size $\beta$ and factors $(\eta, \delta)$ such that $1/4 < \delta < 1$, if it is LLL-reduced with factors $(\eta, \delta)$, and each $b_i^*$ satisfies $\|b_i^*\| = \lambda_1(\mathcal{L}(B_{[i,k]}))$ for $1 \le i < n$ and $k = \min(i + \beta - 1, n)$.

This can be viewed as a generalization of LLL reduction: for $\beta = 2$, the definition amounts to that of an LLL-reduced basis.

To make a basis more reduced than LLL-reduced, the Blockwise-Korkine-Zolotarev (BKZ) reduction algorithm [87] is the best algorithm known in practice. It outputs a BKZ-reduced basis with blocksize $\beta \ge 2$ from an input basis $B = (b_1, \ldots, b_n)$ of a lattice $\mathcal{L}$. It starts by LLL-reducing the basis $B$, then iteratively reduces each local block $B_{[j, \min(j+\beta-1, n)]}$ for $j = 1$ to $n$, to make sure that the first vector of each such block is the shortest in the projected lattice. This gives rise to Algorithm 6, which proceeds in such a way that each block is already LLL-reduced before being enumerated: there is an index $j$, initially set to 1. At each iteration, BKZ performs an enumeration of the local projected lattice $\mathcal{L}_{[j,k]}$, where $k = \min(j + \beta - 1, n)$, to find $v = (v_j, \ldots, v_k) \in \mathbb{Z}^{k-j+1}$ such that $\|\pi_j(\sum_{i=j}^{k} v_i b_i)\| = \lambda_1(\mathcal{L}_{[j,k]})$. We let $h = \min(k + 1, n)$ be the ending index of the new block in the next iteration:

• If $\|b_j^*\| > \lambda_1(\mathcal{L}_{[j,k]})$, then $b^{new} = \sum_{i=j}^{k} v_i b_i$ is inserted between $b_{j-1}$ and $b_j$. This means that we no longer have a basis, so LLL is called on the generating set $(b_1, \ldots, b_{j-1}, b^{new}, b_j, \ldots, b_h)$, to give rise to a new LLL-reduced basis $(b_1, \ldots, b_h)$.

• Otherwise, LLL is called on the truncated basis $(b_1, \ldots, b_h)$.

Thus, at the end of each iteration, the basis $B = (b_1, \ldots, b_n)$ is such that $(b_1, \ldots, b_h)$ is LLL-reduced. When $j$ reaches $n$, it is reset to 1, unless no enumeration was successful, in which case the algorithm terminates: the goal of $z$ in Algorithm 6 is to count the number of consecutive failed enumerations, to check termination.

Algorithm 6 The Block Korkine-Zolotarev (BKZ) algorithm

Input: A basis $B = (b_1, \ldots, b_n)$, a blocksize $\beta \in \{2, \ldots, n\}$, the Gram-Schmidt triangular matrix $\mu$ and $\|b_1^*\|^2, \ldots, \|b_n^*\|^2$.

Output: The basis $(b_1, \ldots, b_n)$ is BKZ-$\beta$ reduced

1: $z \leftarrow 0$; $j \leftarrow 0$; LLL($b_1, \ldots, b_n, \mu$); // LLL-reduce the basis, and update $\mu$

2: while $z < n - 1$ do

3: $j \leftarrow (j \bmod (n-1)) + 1$; $k \leftarrow \min(j + \beta - 1, n)$; $h \leftarrow \min(k + 1, n)$; // define the local block

4: $v \leftarrow$ Enum($\mu_{[j,k]}, \|b_j^*\|^2, \ldots, \|b_k^*\|^2$); // find $v = (v_j, \ldots, v_k) \in \mathbb{Z}^{k-j+1} \setminus \{0\}$ s.t. $\|\pi_j(\sum_{i=j}^{k} v_i b_i)\| = \lambda_1(\mathcal{L}_{[j,k]})$

5: if $v \ne (1, 0, \ldots, 0)$ then

6: $z \leftarrow 0$; LLL($b_1, \ldots, b_{j-1}, \sum_{i=j}^{k} v_i b_i, b_j, \ldots, b_h, \mu$) at stage $j$; // insert the new vector in the lattice at the start of the current block, then remove the dependency in the current block, update $\mu$.

7: else

8: $z \leftarrow z + 1$; LLL($b_1, \ldots, b_h, \mu$) at stage $h - 1$; // LLL-reduce the next block before enumeration.

9: end if

10: end while


The BKZ algorithm forms the main topic of this work and is discussed in great detail in the following chapters.

2.4.4 Characterizing reduced bases 

A common way to check the quality of a basis is to look at the sequence of Gram-Schmidt norms $\|b_1^*\|, \ldots, \|b_n^*\|$. For random lattices, a good basis has a sequence which decreases slowly. In practice, it has been observed that the GS norms of bases produced by reduction algorithms have a typical shape: the sequence $\|b_i^*\|$ roughly forms a geometric progression of ratio $\|b_i^*\|/\|b_{i+1}^*\| = q$. Fig. 2.5 shows an example for reduced bases of dimension 100.

Figure 2.5: Example of Gram-Schmidt norms for a 100 dimensional lattice of unit volume, after different reductions (LLL reduced, BKZ-20 reduced and BKZ-75 reduced). The y axis shows the value $\ln(\|b_i^*\|)$, which is almost linear in the index $i$.



Intuitively, this can be explained by the fact that many reduction algorithms such as LLL and BKZ reduction have an inductive property: when a basis $B$ is reduced, so is $\pi_i(B)$ for the same parameter. In particular, for LLL reduction, the reduction criterion is imposed only on pairs of neighboring basis vectors.

Numerically, it is useful to look at the following values, which evaluate how well reduced a basis is.

Definition 2.4.8 (Root Hermite Factor). The root-Hermite-factor of a basis $B$ is defined as

$\delta(B) = (\|b_1\|/\mathrm{vol}(\mathcal{L})^{1/n})^{1/n}$ (2.11)

For random lattices, it has been observed that a typical LLL-reduced basis has $\delta(B) = 1.021$, and the $\delta$ value of a typical BKZ-reduced basis is related to $\beta$; for example, a typical BKZ-20 reduced basis has $\delta(B) = 1.0128$ and a BKZ-30 reduced basis has $\delta(B) = 1.0134$.
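The root Hermite factor of a concrete basis is straightforward to compute. The Python sketch below assumes a full-rank square integer basis whose rows are the basis vectors, so that the volume is the absolute value of the determinant; the function names are hypothetical and floating-point arithmetic is used for the final roots.

```python
import math
from fractions import Fraction

def abs_det(B):
    """Absolute value of the determinant, by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in B]
    n = len(M)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
        det *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return abs(det)

def root_hermite_factor(B):
    """delta(B) = (||b_1|| / vol(L)^(1/n))^(1/n) for a full-rank square basis."""
    n = len(B)
    norm_b1 = math.sqrt(sum(x * x for x in B[0]))
    vol = float(abs_det(B))
    return (norm_b1 / vol ** (1.0 / n)) ** (1.0 / n)
```

For instance, the basis `[[2, 0], [0, 1]]` has $\|b_1\| = 2$ and volume 2, so its root Hermite factor is $(2/\sqrt{2})^{1/2} = 2^{1/4}$.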

Table 2.1: Average root Hermite factor for different reduced bases of random lattices

              LLL     BKZ-20   BKZ-50   BKZ-85   BKZ-110
$\delta(B)$   1.021   1.013    1.012    1.001    1.009

It is clear that BKZ reduction with a bigger $\beta$ outputs a more reduced basis, whose root Hermite factor is closer to 1. But as the target root Hermite factor gets closer to 1, the improvement becomes increasingly slow. This is illustrated in figure 2.6, which shows the average root Hermite factor for BKZ-$\beta$ with $\beta$ up to 250.

The prediction and significance of the root Hermite factor of reduction is a central topic of this work and will be discussed several times in later chapters.





Figure 2.6: Expected root Hermite factor for BKZ-$\beta$ reduced bases of random lattices. The x axis is the block size $\beta$ of the reduction.


Definition 2.4.9 (Half volume). The original definition of the half volume applies to a basis $B$ of even dimension $n$:

$$\eta(B) = \frac{\prod_{i=1}^{n/2} \|b_i^*\|}{\prod_{i=n/2+1}^{n} \|b_i^*\|} \qquad (2.12)$$

We extend this definition to the case where $n$ is odd by the convention

$$\eta(B) = \frac{\prod_{i=1}^{\lfloor n/2 \rfloor} \|b_i^*\|}{\prod_{i=\lceil n/2 \rceil+1}^{n} \|b_i^*\|} \qquad (2.13)$$


These two quantities measure the quality of the basis in two different respects. The root Hermite factor describes the distance between the current $\|b_1\|$ and $\lambda_1(\mathcal{L})$ or $\mathrm{vol}(\mathcal{L})^{1/n}$, while the half volume is related to the cost of enumeration on the basis vectors, which determines the time complexity of solving SVP. If we approximate $\|b_i^*\|$ by a geometric progression with ratio $q$, that is $\|b_i^*\| = \|b_1\| q^{-(i-1)}$, then we can deduce that

$$\eta(B) \approx q^{n^2/4} \qquad (2.14)$$

and

$$\delta(B) \approx \sqrt{q}. \qquad (2.15)$$
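The approximations (2.14) and (2.15) can be checked numerically. The following Python sketch is only an illustration: the helper names and the exact profile $\|b_i^*\| = \|b_1\| q^{-(i-1)}$ are assumptions made here, not part of the original text. It works with logarithms to avoid overflow in high dimension.

```python
import math

def geometric_profile(n, q, b1=1.0):
    """Gram-Schmidt norms ||b_i*|| = b1 * q^-(i-1), i = 1..n (assumed model)."""
    return [b1 * q ** (-i) for i in range(n)]

def root_hermite_factor(norms):
    """delta(B) = (||b_1|| / vol^(1/n))^(1/n), using vol = prod ||b_i*||."""
    n = len(norms)
    log_vol = sum(math.log(x) for x in norms)
    return math.exp((math.log(norms[0]) - log_vol / n) / n)

def log_half_volume(norms):
    """ln eta(B): first floor(n/2) norms over the norms from ceil(n/2)+1 on."""
    n = len(norms)
    head = sum(math.log(x) for x in norms[: n // 2])
    tail = sum(math.log(x) for x in norms[(n + 1) // 2:])
    return head - tail

n, q = 100, 1.04                      # illustrative values only
norms = geometric_profile(n, q)
print(root_hermite_factor(norms), math.sqrt(q))
print(log_half_volume(norms), (n * n / 4) * math.log(q))
```

Under this model the root Hermite factor equals $q^{(n-1)/(2n)}$ exactly, which is why $\sqrt{q}$ is only an approximation for finite $n$.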

The link between lattice reduction and SVP is two-fold. First, reduction usually provides an approximate solution to SVP; when $n$ is small, it even provides a solution to SVP with non-negligible probability [28]. Second, it is preferable to perform enumeration on a reduced basis in order to obtain vectors even shorter than the current $b_1$. We will explain the enumeration procedure in the following chapters.


2.5 LLL Reduction 

The invention of the LLL reduction algorithm has led to important breakthroughs in many fields, including cryptology, integer programming, algorithmic number theory and computer algebra. Since its first publication in 1982, much effort has gone into improving its performance and studying its behavior. Following the original algorithm, which uses rational arithmetic, many improvements were achieved by using floating-point arithmetic as an approximation.

Let $\mathcal{L} \subset \mathbb{Z}^m$ be a lattice of dimension $n$, represented by a basis whose entries are bounded in absolute value by an integer $B$. The complexity of the original algorithm is $O(n^5 m \log^3 B)$, which has a total degree of 9. The first provable floating-point version was given by Schnorr [86], which reduces the complexity to a total degree of 7. The original algorithm has cubic complexity in the bit size of the entries. This was lowered to quadratic [71], and more recently to quasi-linear [73].


Table 2.2: Complexity bounds of the original LLL and the floating-point LLL algorithms

Algorithm       Complexity bound
LLL [51]        $O(n^5 m \log^3 B)$
Schnorr [86]    $O(n^3 m (n + \log B)^2 \log B)$
$L^2$ [71]      $O(n^4 m \log B \, (n + \log B))$
$L^1$ [73]      $O(n^{5+\varepsilon} \log B + n^{\omega+1+\varepsilon} \log^{1+\varepsilon} B)$


On the practical side, several software implementations are available, such as NTL, fplll, Magma, LiDIA and Maple. Among them, fplll is a dedicated LLL reduction program, which is the most efficient and is still frequently updated, while the others are components of larger computer algebra libraries, so that LLL can conveniently be used together with other algorithms.

2.5.1 Algorithm description 

The original LLL reduction is described in Algorithm 7.

Algorithm 7 LLL reduction algorithm

Input: A basis $B = \{b_1, \ldots, b_n\}$ of the lattice $\mathcal{L}$, factors $(\eta, \delta)$
Output: An LLL-reduced basis $\{b_1, \ldots, b_n\}$
 1: $i \leftarrow 2$
 2: while $i \le n$ do
 3:     size-reduce($B_i$) with $\eta$
 4:     if $\|b_i^* + \mu_{i,i-1} b_{i-1}^*\|^2 < \delta \|b_{i-1}^*\|^2$ then
 5:         swap($b_i$, $b_{i-1}$)
 6:         if $i > 2$ then $i \leftarrow i - 1$ end
 7:     else
 8:         $i \leftarrow i + 1$
 9:     end if
10: end while


A typical choice of $(\eta, \delta)$ is $(0.501, 0.999)$, for which the algorithm achieves almost the best output quality. The computation of the GSO coefficients here uses the exact values of $\mu_{i,j}$, in an exact algebraic representation. During execution, the algorithm maintains the loop invariant that the partial basis $(b_1, \ldots, b_{i-1})$ is LLL-reduced. Therefore, upon termination, the whole output basis is guaranteed to be LLL-reduced. However, the polynomial complexity of this algorithm is not evident at a glance. A common proof uses a volume argument due to Lovász.
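To make Algorithm 7 concrete, here is a minimal Python sketch using exact rational arithmetic. It is an illustration under stated assumptions rather than a reference implementation: it fixes $\eta = 1/2$ (instead of 0.501), recomputes the whole GSO after every change (simple but slow), and the function names are hypothetical.

```python
from fractions import Fraction

def gram_schmidt(basis):
    """Exact GSO of the rows of `basis`: returns (norms2, mu) with
    norms2[i] = ||b_i*||^2 and mu[i][j] = <b_i, b_j*> / ||b_j*||^2."""
    n = len(basis)
    bstar, norms2 = [], []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in basis[i]]
        for j in range(i):
            mu[i][j] = sum(a * b for a, b in zip(basis[i], bstar[j])) / norms2[j]
            v = [a - mu[i][j] * b for a, b in zip(v, bstar[j])]
        bstar.append(v)
        norms2.append(sum(x * x for x in v))
    return norms2, mu

def lll_reduce(basis, delta=Fraction(99, 100)):
    """LLL-reduce linearly independent integer rows (eta = 1/2 assumed)."""
    basis = [list(row) for row in basis]
    n = len(basis)
    i = 1                                   # 0-based index of b_2
    while i < n:
        for j in range(i - 1, -1, -1):      # size-reduce b_i against b_j
            _, mu = gram_schmidt(basis)
            c = round(mu[i][j])
            if c:
                basis[i] = [a - c * b for a, b in zip(basis[i], basis[j])]
        norms2, mu = gram_schmidt(basis)
        # ||b_i* + mu_{i,i-1} b_{i-1}*||^2, expanded by orthogonality
        lhs = norms2[i] + mu[i][i - 1] ** 2 * norms2[i - 1]
        if lhs < delta * norms2[i - 1]:     # Lovasz condition fails: swap
            basis[i], basis[i - 1] = basis[i - 1], basis[i]
            i = max(i - 1, 1)
        else:
            i += 1
    return basis

def is_lll_reduced(basis, delta=Fraction(99, 100)):
    """Check size reduction (|mu| <= 1/2) and the Lovasz condition."""
    norms2, mu = gram_schmidt(basis)
    n = len(basis)
    size_ok = all(abs(mu[i][j]) <= Fraction(1, 2)
                  for i in range(n) for j in range(i))
    lovasz_ok = all(norms2[i] + mu[i][i - 1] ** 2 * norms2[i - 1]
                    >= delta * norms2[i - 1] for i in range(1, n))
    return size_ok and lovasz_ok

reduced = lll_reduce([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print(reduced, is_lll_reduced(reduced))
```

Exact fractions mirror the rational computation of the original algorithm; the floating-point variants discussed later replace exactly this part.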

2.5.2 Polynomial running time proof 

In line 6 of the algorithm, the index $i$ goes backwards, which is the main obstacle to showing that the algorithm runs in polynomial time. The proof can be sketched as follows. We first define an integer quantity $D$. By showing that $D$ decreases by at least a factor of $\delta < 1$ each time line 6 is executed, we show that this line is executed at most polynomially many times.

Define $d_i = \prod_{j=1}^{i} \|b_j^*\|^2$ and $D = \prod_{i=1}^{n} d_i$. Note that $d_i$ is the squared volume of the lattice generated by $B_i$, that is, the determinant of the Gram matrix of $B_i$; it is therefore an integer, and so is $D$. Each time line 6 is executed, a swap takes place, and among all Gram-Schmidt vectors only $b_{i-1}^*$ and $b_i^*$ can be modified. To avoid confusion, we denote by $c_{i-1}^*$ and $c_i^*$ their new values after the swap. We have

$$\|b_{i-1}^*\| \cdot \|b_i^*\| = \|c_{i-1}^*\| \cdot \|c_i^*\|,$$

and, as Lovász' condition is not satisfied, $\|c_{i-1}^*\|^2 < \delta \|b_{i-1}^*\|^2$. Therefore $d_{i-1}$ decreases by at least a factor of $\delta$, while all other $d_j$, $j \ne i-1$, are not modified. Consequently, $D$ decreases by at least a factor of $\delta$. It follows that the number of swaps is at most logarithmic in the initial value of $D$, whose logarithm is $O(n^2 \log B)$. Thus the while loop is executed at most $O(n^2 \log B)$ times before the index $i$ exceeds $n$.

Finally, we still have to bound the size of the numbers that appear during the arithmetic operations of the computation. The analysis in [51] provides such a bound. This completes the proof and leads to the following theorem.

Theorem 2.5.1. For a lattice basis $B$ with integer entries, Algorithm 7 computes an LLL-reduced basis in time polynomial in $\log B$.
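The swap bound used in the proof can be written out explicitly. The short derivation below is a sketch that relies only on the two facts stated in the proof ($D$ is a positive integer, and each swap multiplies $D$ by at most $\delta$); the names $D_0$ (initial value of $D$) and $D_k$ (value after $k$ swaps) are introduced here for illustration only.

```latex
1 \le D_k \le \delta^{k} D_0
\quad\Longrightarrow\quad
k \le \frac{\log D_0}{\log(1/\delta)} .
```

For a fixed $\delta < 1$ the denominator is a constant, so the number of swaps is $O(\log D_0)$.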


2.5.3 Floating point algorithms 

The computation of the GSO in the original LLL reduction uses exact arithmetic, which is expensive. A straightforward idea is to use floating-point arithmetic to speed it up. The first floating-point algorithm simply replaces the exact values of $\langle b_i, b_j \rangle$, and thus of $\mu_{i,j}$, by floating-point approximations. However, this idea suffers from a number of drawbacks and requires additional correction routines to guarantee the correctness of the output basis.

Schnorr reduction The first provable floating-point variant of LLL reduction was given by Schnorr [86], with a complexity of total degree 7. The main idea is to use a finite number of iterations of Schulz's method (a Newton iteration for the inverse of a matrix) to maintain accuracy. By taking the working precision to be $\ell = c_1 n + c_2 \log B$, where $c_1$ and $c_2$ are explicitly computable constants, Schnorr proved that this algorithm terminates and returns a $(0.55, 0.95)$ LLL-reduced basis.

$L^2$ reduction The $L^2$ reduction was introduced by Nguyen and Stehlé [71]. It has been proved to have quadratic complexity with respect to $\log B$. The Gram matrix of the basis is computed at the beginning of the algorithm and updated after each change of the basis vectors. In addition, the algorithm uses a lazy size-reduction procedure, which repeatedly performs size reduction and updates the Gram matrix until size reduction no longer alters the underlying matrix. The frequent renewal of the Gram matrix and the lazy size reduction make it possible to work with a much lower precision than Schnorr's reduction. The working precision is proved to be $\ell \approx 1.58\,d$ (with $d$ the lattice dimension), which is independent of $\log B$, compared to $\ell \ge 12d + 7\log_2 B$ in Schnorr's reduction. The number of times a transformation is applied to the original basis matrix is $O(\log B)$, while the original basis has entries of size $O(\log B)$. Consequently, the total complexity is only quadratic in $\log B$, which explains the name of the algorithm.

$L^1$ reduction This algorithm, proposed by Novocin et al. in 2010 [73], is the first reduction algorithm that is quasi-linear in $\log B$. As in $L^2$, the working precision required by the $L^1$ algorithm is independent of $\log B$. But instead of applying the transformations directly to the original basis matrix as $L^2$ reduction does, $L^1$ reduction renews the underlying Gram matrix on a total of $O(\log \log B)$ different levels $B_1, B_2, \ldots$ The entries of $B_i$ consist of the first $2^{i-1} \cdot c$ bits of the entries of the original basis matrix. The transformation on level $i$ is carried over to level $i + 1$ only when level $i$ is fully reduced. This delayed transformation ensures that the matrix on level $i$ is renewed only $O(2^{\log\log B - i})$ times, which gives the quasi-linear complexity of the reduction algorithm.

Figure 2.7 illustrates the $L^2$ and the $L^1$ reduction algorithms.

Figure 2.7: Illustration of $L^2$ and $L^1$ reductions



2.6 Solving SVP 

The two main families of algorithms for solving SVP and CVP, and their approximate versions, are sieve and enumeration algorithms.

2.6.1 Sieve algorithm

We only give a brief introduction to the sieve algorithm, and focus more on the enumeration algorithm. The sieve algorithm was first introduced by Ajtai et al. [6] in 2001. It is a probabilistic algorithm which is asymptotically faster than the deterministic enumeration algorithm. The algorithm starts with a basis that is already reduced.

The idea is to sample many random lattice points in a ball $B_n(r) \cap \mathcal{L}$; as the number of points increases, at some point there will be two points $v_1$ and $v_2$ such that $\|v_1 - v_2\| < r/2$. This is similar to finding a collision in a birthday attack, and the number of points needed for a collision can be analysed rigorously. At the end of sampling we expect to find many such pairs of points, and all their differences lie in $B_n(r/2) \cap \mathcal{L}$. We then iteratively reduce the value of $r$ by a constant factor and repeat the sampling procedure. This sieving procedure is the most expensive part of the algorithm. In the end we are able to find a short enough vector.


2.6.2 Enumeration algorithm 


The enumeration algorithm, in its original form from the 1980s, is deterministic. It is the simplest method for solving SVP: it searches directly for a suitable integer combination of the basis vectors. It consists of exhaustively searching for the projections of any shortest vector in the projected lattices $\pi_i(\mathcal{L})$, from $i = n$ down to 1. When searching for a shortest vector $v$ in an $n$-dimensional lattice $\mathcal{L}$, where the length of $v$ is known to be bounded by $r$, the algorithm searches for all vectors in $\pi_i(\mathcal{L})$ of length at most $r$, lifts them to $\pi_{i-1}(\mathcal{L})$, keeps those vectors in $\pi_{i-1}(\mathcal{L})$ whose length is at most $r$, and so on. When $i = 1$ is reached, the shortest vector is picked out. The bound $r$ can be estimated by the Gaussian heuristic, or more conservatively from a known vector or bound, for example $r = \|b_1\|$ or $r = \sqrt{\gamma_n}\,\mathrm{vol}(\mathcal{L})^{1/n}$.

The actual algorithm traverses the enumeration tree by depth-first search, where each node can be derived from its parent and its preceding sibling. Therefore the memory requirement, besides the GSO, is linear in the depth of the enumeration tree, which is the dimension $n$. However, the time complexity of this algorithm is exponential. In fact, the total number of nodes visited in layer $i$ is the number of vectors in $\pi_i(\mathcal{L})$ with length bounded by $r$, which can be evaluated by the Gaussian heuristic.

The number of nodes in level $i$, i.e., the number of vectors in the lattice $\pi_i(\mathcal{L})$ with norm smaller than $r$, is approximately


$$N_i = \frac{r^{n-i+1} \cdot v_{n-i+1}}{\mathrm{vol}(\pi_i(\mathcal{L}))} \qquad (2.16)$$


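Because $v_n^{-2/n} = \Theta(n)$, $r$ is usually chosen to be $\mathrm{vol}(\mathcal{L})^{1/n} \cdot \Theta(\sqrt{n})$, unless we are given additional information. Now consider the layer $i = \lfloor n/2 \rfloor$:

$$N_{\lfloor n/2 \rfloor} = \frac{r^{\,n-\lfloor n/2 \rfloor+1} \cdot v_{n-\lfloor n/2 \rfloor+1}}{\sqrt{\mathrm{vol}(\mathcal{L})/\eta(B)}} = \Theta\!\left(\eta(B)^{1/2} \cdot (e\pi)^{n/2}\right)$$

According to the expression of $\eta(B)$ in equation 2.14, $N_{\lfloor n/2 \rfloor}$ is super-exponential in $n$. Therefore so is the complexity of the enumeration algorithm.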
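The per-level estimate (2.16) is easy to evaluate once the Gram-Schmidt norms are known. The Python sketch below is an illustration only; the function names are hypothetical, and the input is assumed to be the list of norms $\|b_1^*\|, \ldots, \|b_n^*\|$.

```python
import math

def ball_volume(k, radius=1.0):
    """Volume of the k-dimensional ball of the given radius."""
    return math.pi ** (k / 2) / math.gamma(k / 2 + 1) * radius ** k

def level_counts(gs_norms, r):
    """Heuristic N_i of (2.16) for i = 1..n, given the norms ||b_i*||."""
    n = len(gs_norms)
    counts = []
    for i in range(1, n + 1):
        k = n - i + 1                               # dimension of pi_i(L)
        proj_vol = math.prod(gs_norms[i - 1:])      # vol(pi_i(L))
        counts.append(ball_volume(k, r) / proj_vol)
    return counts

# Toy check on the orthonormal basis of Z^3 with r = 1 (illustrative values).
counts = level_counts([1.0, 1.0, 1.0], r=1.0)
print(counts)
```

For realistic dimensions the intermediate values overflow double precision, so a practical version would work with logarithms instead.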

Although sieving has a smaller asymptotic complexity in theory, in practice the enumeration algorithm has a small constant, which makes it more competitive than sieving.

2.7 Computational challenges proposed by Darmstadt University

Darmstadt University has published a series of lattice challenges on a website [54], inviting participants to solve lattice problems of different dimensions. This serves as a platform for testing and comparing the efficiency of different problem-solving methods. It also provides a reference for the highest dimensions in which these problems can currently be solved in practice.

So far, three categories of problems have been proposed. Each category provides problems in different dimensions, which can be either downloaded or generated from the given source code. The website also lists the problem solvers, ranked by the dimension of the problem and the quality of the solution.

The challenge includes the following three categories:

Finding short vectors in Ajtai lattices Following Ajtai's work [4], a special class of lattices is considered to be the hardest among all lattices: solving the problem on these lattices implies solving it on all other lattices. The challenge consists of finding short vectors in some of these Ajtai lattices of high dimension. The construction of these lattice bases and the proof of the existence of short vectors in each of the corresponding lattices are presented in [17].

 Challenge lattices are provided in different dimensions, where initially $\|b_1\| = q$ with $q$ a known constant. One needs either to find a vector shorter than $b_1$ in a dimension higher than those already solved, or to find a vector even shorter than the previous solution.

SVP for random lattices This challenge asks for solutions to HSVP$_\gamma$ for random integer lattices, where $\gamma = 1.05 \cdot \frac{\Gamma(n/2+1)^{1/n}}{\sqrt{\pi}}$. For each dimension, different sample lattices can be generated with different integer seeds, and a solution to any of them is taken into account. Participants are required either to find a solution to HSVP$_\gamma$ in a dimension not solved before, or to find a solution with a better approximation factor than the previous solution (which may be for a different lattice of the same dimension).

SVP for ideal lattices While the previous challenge focuses on the hardness of SVP on random integer lattices, this challenge tests the hardness of SVP on ideal lattices. Ideal lattices form a special class of lattices with additional algebraic structure. They are used to build efficient primitives and homomorphic encryption schemes, which need shorter keys and allow faster operations thanks to this algebraic structure.

 Ideal lattices are often defined in terms of polynomial rings. For a polynomial $v(x) = v_n x^n + \cdots + v_0$, its vector representation is $(v_n, \ldots, v_0)$. An ideal lattice is defined as follows. Let $f(x)$ be a monic polynomial, i.e. the coefficient of its highest power is 1. Then the set of all vector representations of the polynomials in an ideal of the ring $\mathbb{Z}[x]/f(x)$ forms an ideal lattice. In particular, for $f(x) = x^n - 1$, this lattice is cyclic. The most frequently used choices are $f(x) = x^n + 1$ and $f(x) = x^n + x^{n-1} + \cdots + 1$.

 It is unknown whether SVP could be much easier on ideal lattices by exploiting their ideal structure, which motivates a separate challenge for SVP on ideal lattices [75]. The challenge is further separated into two different HSVP$_\gamma$ challenges, with $\gamma_1 = 1.05 \cdot \frac{\Gamma(n/2+1)^{1/n}}{\sqrt{\pi}}$ and $\gamma_2 = n$.
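To illustrate the algebraic structure of ideal lattices, the following Python sketch builds a basis of the principal ideal generated by a polynomial $v(x)$ in $\mathbb{Z}[x]/f(x)$ from the rows $v, xv, \ldots, x^{n-1}v$. This is a toy illustration, not the challenge generator: the function names are hypothetical, coefficients are stored from degree 0 upwards (the reverse of the vector representation used above), and $f$ is passed through its lower-order part `f_low`, so that $f(x) = x^n + f_{low}(x)$.

```python
def times_x(v, f_low):
    """Multiply v(x) by x modulo the monic f(x) = x^n + f_low(x).
    Coefficients are listed from degree 0 to degree n-1."""
    top = v[-1]                          # coefficient that would move to x^n
    shifted = [0] + list(v[:-1])
    # x^n is congruent to -f_low(x) modulo f
    return [s - top * c for s, c in zip(shifted, f_low)]

def rotation_basis(v, f_low):
    """Rows v, x*v, ..., x^(n-1)*v reduced modulo f."""
    rows = [list(v)]
    for _ in range(len(v) - 1):
        rows.append(times_x(rows[-1], f_low))
    return rows

cyclic = rotation_basis([1, 2, 3], [-1, 0, 0])      # f(x) = x^3 - 1
negacyclic = rotation_basis([1, 2, 3], [1, 0, 0])   # f(x) = x^3 + 1
print(cyclic)
print(negacyclic)
```

With $f(x) = x^n - 1$ each row is a cyclic shift of the previous one, which is why the lattice is called cyclic; with $f(x) = x^n + 1$ the wrapped coefficient changes sign.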


2.8 Notations 

To conclude this chapter, we list the main notations in the following table for convenience of reference.

Notation                      Interpretation
$\mathcal{L}$                 A lattice in $\mathbb{R}^m$
$n$                           Dimension of a lattice
$B = \{b_1, \ldots, b_n\}$    Basis of a lattice
$B_i$                         The sublattice generated by the first $i$ vectors in $B$
$\pi_i(B)$                    The projected lattice of $B$ on the orthogonal supplement of the linear span of $B_{i-1}$
$\mathcal{L}(B_{[j,k]})$      $\mathcal{L}(\pi_j(B_k))$
$\mathrm{vol}(\cdot)$         Volume (of a lattice $\mathcal{L}$)
$B_n$                         A ball of dimension $n$ with unit radius
$v_n$                         The volume of $B_n$
$\|b_i^*\|$                   Gram-Schmidt norms of a basis
$\mu_{i,j}$                   $\langle b_i, b_j^* \rangle / \|b_j^*\|^2$
$\lambda_i(\mathcal{L})$      Norm of the $i$-th Minkowski's successive minimum in lattice $\mathcal{L}$
$GH(\mathcal{L})$             Prediction of $\lambda_1(\mathcal{L})$ by Gaussian heuristic. $GH(\mathcal{L}) = (\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$
$\delta(B)$                   Root Hermite factor of a basis. $\delta(B) = (\|b_1\|/\mathrm{vol}(\mathcal{L})^{1/n})^{1/n}$
$\eta(B)$                     Half volume of a basis. $\eta(B) = \prod_{i=1}^{\lfloor n/2 \rfloor} \|b_i^*\| \,/\, \prod_{i=\lceil n/2 \rceil+1}^{n} \|b_i^*\|$

Table 2.3: Principal notations used in this work
Enumeration for SVP

Enumeration is one of the most basic algorithms for solving many lattice problems, including the exact and approximate versions of SVP and CVP. Besides being a stand-alone algorithm, it is also an essential sub-procedure of BKZ reduction. As a stand-alone algorithm, its performance depends on how well the basis is reduced. As a sub-procedure of lattice reduction, its complexity influences the behavior of the reduction algorithm. Speeding up enumeration is the central topic of this chapter.

In this chapter we focus on the enumeration procedure for SVP, which is what the discussion of BKZ reduction requires; the enumeration procedure for CVP is very similar. We begin by discussing the properties of short vectors in random lattices, then turn to the enumeration procedure and its analysis. The exhaustive search of the full enumeration procedure can be greatly improved by abandoning branches which are less likely to lead to solutions. This is called pruned enumeration.

Pruned enumeration turns the sequential deterministic algorithm into a probabilistic algorithm which can be implemented with parallel computation. It uses a pruning function to decide whether a branch should be rejected. This pruning function is precomputed prior to enumeration, and is a deciding factor of the running time. For a given basis, it is possible to compute a quasi-optimal pruning function, which minimizes the expected time for computing a shortest vector.

In high dimension, searching for the solution of the exact SVP by enumeration is infeasible in practice. But we can still hope to solve approximate SVP$_\gamma$ problems over random lattices. In fact, for a random lattice of dimension $n$, there are approximately $\gamma^n$ vectors within radius $\gamma \cdot \lambda_1(\mathcal{L})$. Taking the distribution of short vectors into consideration, we are able to find suitable pruning functions which allow enumeration on bases of dimension higher than 200.

A pruning function of dimension $n$ is a vector of dimension $n$. It is the solution of an optimization problem which is specific to each basis. Initially, we would precompute a pruning function whenever we are given a new basis, by searching for the solution in an $n/2$-dimensional space. Regardless of the search method and of the initialization of the search, the optimization always appears to converge to the global minimum, which suggests that we can treat the problem as a convex optimization problem and use faster algorithms than random search. In addition, the pruning functions generally display very similar shapes, which indicates that they lie in a subspace whose dimension is much smaller than $n/2$. Extracting such a subspace from abundant high-dimensional data is a well-studied subject in data mining, usually addressed by principal component analysis (PCA). Our PCA shows that these data are distributed almost entirely in a subspace of dimension 4, which means that a pruning function can be expressed as a linear combination of 4 terms; in other words, 4 parameters are enough to determine a bounding function. Finally, we present a way to parameterize the bounding function effectively so that it can be generated very simply and quickly.

Since the complexity is also affected by the reductions applied prior to enumeration, we also dedicate a section to the discussion of basis quality.


3.1 Distribution of short vectors 

In this section, we investigate the distribution of the shortest vector in a random lattice and, more generally, the joint distribution of the first $N$ short vectors in a random lattice. The notion of random lattices is given in section 2.3, and is based on the definition of the uniform distribution over all lattices of unit volume.

According to the results of Södergren [94], the norms of the $N$ first shortest vectors define a stochastic process, which converges to a Poisson process when the dimension $n$ tends to infinity. In particular, $\lambda_1(\mathcal{L})$ of a random lattice defines a variable which, suitably normalized, converges to an exponential distribution. Interestingly, the expected number of short vectors according to this distribution coincides with the prediction of the Gaussian heuristic. Experiments show that this distribution converges very fast, and the predictions match the experiments as soon as $n > 10$.


3.1.1 Poisson process 

Definition 3.1.1 (Exponential distribution). The exponential distribution with parameter $\lambda$ is characterized by the following probability density function (pdf):

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0. \end{cases}$$

The exponential distribution describes the intervals between events in a Poisson process. It has expectation $E(X) = 1/\lambda$ and variance $\mathrm{Var}(X) = 1/\lambda^2$.

A Poisson process can be described by 3 collections of variables:

• The sequence of interval times $X = (X_1, X_2, \ldots)$, where $X_i$ is the time interval between the $(i-1)$-th and the $i$-th event.

• The sequence of arrival times $T = (T_1, T_2, \ldots)$, where $T_i$ is the time at which the $i$-th event takes place.

• The counting process $N = (N_t, t \ge 0)$, where $N_t$ denotes the number of arrivals in $(0, t]$ for any non-negative $t$.

Clearly, $T_n$ is the partial sum of the first $n$ elements of $X$:

$$T_n = \sum_{i=1}^{n} X_i, \qquad X_i = T_i - T_{i-1}.$$

$T$ and $N$ are inverses of one another:

$$T_n = \min\{t \ge 0 : N_t = n\}, \quad n \in \mathbb{N}$$
$$N_t = \max\{n \in \mathbb{N} : T_n \le t\}, \quad t \in [0, \infty)$$

Definition 3.1.2 (Poisson process). A Poisson process of intensity $\lambda$ is a process whose interval times are i.i.d. variables following the exponential distribution with parameter $\lambda$.

In a Poisson process, events occur continuously and independently at a constant average rate. A Poisson process has the important property of being memoryless, i.e., the number of arrivals occurring in any bounded interval of time after time $t$ is independent of the number of arrivals occurring before time $t$.
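The three descriptions of a Poisson process (interval times, arrival times, counting process) can be sketched in a few lines of Python. This is a small simulation for illustration only; the function names, the seed and the sample size are arbitrary choices made here.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_poisson_process(rate, count, rng):
    """Return (X, T): `count` i.i.d. exponential interval times with the
    given rate, and the arrival times T_n = X_1 + ... + X_n."""
    intervals = [rng.expovariate(rate) for _ in range(count)]
    arrivals = list(accumulate(intervals))
    return intervals, arrivals

def arrivals_up_to(arrivals, t):
    """Counting process N_t = max{n : T_n <= t} (0 if there is no arrival)."""
    return bisect_right(arrivals, t)

rng = random.Random(2024)            # fixed seed, arbitrary choice
X, T = sample_poisson_process(0.5, 1000, rng)
print(sum(X) / len(X))               # empirical mean interval; E(X) = 1/rate = 2
print(arrivals_up_to(T, 20.0))       # N_20 for this sample
```

The intensity 0.5 is chosen to match the intensity $1/2$ that appears in the results on short vectors below.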


3.1.2 Short vectors in unit volume random lattices 

Let $X_n$ be the space of all unit volume lattices of dimension $n$. For a random lattice $\mathcal{L} \in X_n$, we study the norms of the pairs of non-zero vectors $\pm v$.

Given a lattice $\mathcal{L} \in X_n$, we order the norms of its non-zero vectors as $0 < \ell_1 \le \ell_2 \le \ell_3 \le \cdots$, where we count the common length of the vectors $v$ and $-v$ only once. For $i \ge 1$, we define

$$u_i = v_n \cdot \ell_i^n, \qquad (3.1)$$

where $v_n$ is the volume of the $n$-dimensional unit ball, so $u_i$ is the volume of the $n$-dimensional ball of radius $\ell_i$.

Theorem 3.1.3 (Distribution of short vectors). [94] For any fixed $N$, the $N$-dimensional random variable $(u_1, \ldots, u_N)$ converges in distribution to the distribution of the first $N$ points of a Poisson process on the positive real line with intensity $1/2$ as $n \to \infty$.

Corollary 3.1.4. For a lattice $\mathcal{L} \in X_n$, the distribution of $v_n \cdot \lambda_1(\mathcal{L})^n$ converges to the exponential distribution with parameter $1/2$ as $n \to \infty$.


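The cumulative distribution function (cdf) of a variable $X$ following the exponential distribution with parameter $1/2$ is given by:

$$\Pr(X < x) = \begin{cases} 1 - e^{-x/2}, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (3.2)$$

Then $\lambda_1(\mathcal{L})$ defines the variable $Y = (X/v_n)^{1/n}$. With the substitution $x = v_n \cdot y^n$ we have:

$$\Pr(Y < y) = \begin{cases} 1 - e^{-y^n v_n/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \qquad (3.3)$$

Then the pdf of the variable $Y$ is:

$$f(y) = \begin{cases} \frac{n v_n}{2}\, y^{n-1} e^{-y^n v_n/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \qquad (3.4)$$

And the expectation $E(Y)$ is:

$$E(Y) = \frac{E(X^{1/n})}{v_n^{1/n}} = \left(\frac{2}{v_n}\right)^{1/n} \Gamma\!\left(1 + \frac{1}{n}\right) \qquad (3.5)$$

Here, the $\Gamma$ function can be expanded into a Taylor series at $x = 1$:

$$\Gamma(1 + \epsilon) = 1 - \epsilon \cdot \gamma + o(\epsilon),$$

where $\gamma \approx 0.577$ is the Euler-Mascheroni constant.

From this we conclude that $E(\lambda_1(\mathcal{L}))$ tends to

$$\left(1 - \frac{\gamma}{n}\right) \left(\frac{2}{v_n}\right)^{1/n} \qquad (3.6)$$

as $n$ tends to $+\infty$.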

By calculating the expectation of $N_t$, the number of arrivals in the Poisson process, we obtain the following corollary, which coincides with the prediction given by the Gaussian heuristic.

Corollary 3.1.5. In a random lattice $\mathcal{L}$ of dimension $n$, for $r > 1$, the total number of lattice vectors (including the symmetric vectors) inside the ball of radius $r$ is expected to be

$$v_n \cdot r^n, \qquad (3.7)$$

and the $i$-th shortest vector is expected to have the radius

$$\left(\frac{2i}{v_n}\right)^{1/n} \quad \text{for } n \to \infty. \qquad (3.8)$$

3.1.3 Comparing with random instances in experiment 

In practice, the lattices we deal with are mostly integer lattices. We presented in section 2.3 a simple algorithm to sample randomly from integer lattices of prime volume $p$. Once normalized by a factor of $1/p^{1/n}$, the uniform distribution over these integer lattices converges towards the uniform distribution over random lattices of unit volume.

From the previous theorem, we obtain the following properties of the $N$ shortest vectors in random lattices of a given volume $\mathrm{vol}(\mathcal{L})$. As $n$ tends to infinity, we have:

• The variable defined by $\lambda_1(\mathcal{L})^n \cdot v_n/\mathrm{vol}(\mathcal{L})$ converges to the exponential distribution with parameter $1/2$.

• The expected value of $\lambda_1(\mathcal{L})$ converges to $(1 - \gamma/n) \cdot (2 \cdot \mathrm{vol}(\mathcal{L})/v_n)^{1/n}$.

• The number of vectors with norm shorter than $r \cdot (\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$ is $r^n$, taking into account both $\pm v$.

Compared with the prediction of the Gaussian heuristic, which is $(\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$, this expectation of $\lambda_1$ is not exactly the same. The two differ by a factor of $(1 - \gamma/n) \cdot 2^{1/n}$, which is a slight difference when $n$ is large. However, the expected number of points within a given radius is the same as predicted by the Gaussian heuristic.
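The size of this gap can be evaluated numerically. The Python sketch below compares the exact limit expectation $(2\,\mathrm{vol}(\mathcal{L})/v_n)^{1/n}\,\Gamma(1 + 1/n)$ with the Gaussian heuristic and with the first-order factor $(1 - \gamma/n) \cdot 2^{1/n}$. The function names are hypothetical, and the use of `math.gamma` limits this sketch to dimensions of a few hundred before the unit-ball volume computation overflows.

```python
import math

def unit_ball_volume(n):
    """Volume v_n of the n-dimensional unit ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def gaussian_heuristic(n, vol=1.0):
    """GH(L) = (vol(L) / v_n)^(1/n)."""
    return (vol / unit_ball_volume(n)) ** (1.0 / n)

def expected_lambda1(n, vol=1.0):
    """Limit expectation (2 vol / v_n)^(1/n) * Gamma(1 + 1/n)."""
    return (2.0 * vol / unit_ball_volume(n)) ** (1.0 / n) * math.gamma(1.0 + 1.0 / n)

EULER_GAMMA = 0.5772156649           # Euler-Mascheroni constant

for n in (10, 50, 100):
    ratio = expected_lambda1(n) / gaussian_heuristic(n)
    first_order = (1 - EULER_GAMMA / n) * 2 ** (1 / n)
    print(n, ratio, first_order)
```

The printed ratio approaches 1 as $n$ grows, which is the sense in which the difference is slight in high dimension.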

We also compared this distribution with experimental data to observe the speed of convergence (Fig. 3.1, Fig. 3.2 and Fig. 3.3). We generate 1000 random integer lattices of dimension $n$, and compute the value $\lambda_1(\mathcal{L}) \cdot (v_n/\mathrm{vol}(\mathcal{L}))^{1/n}$ for each of them. Their cumulative histogram (solid line) is then compared to the cumulative distribution function of $x^{1/n}$, where $x$ is a random variable following the exponential distribution with parameter $1/2$ (dashed line). It is clear that for $n > 10$ these two curves are already very close to each other, which shows that the distribution of $\lambda_1(\mathcal{L})$ converges very quickly.

Figure 3.1: Lattice dimension n = 10

Figure 3.3: Lattice dimension n


3.2 Enumeration procedure and analysis 


In this section, we present the classical enumeration algorithm, and its complexity analysis in worst 
case as well as average case. For clarity we summarize the definitions of notations that we are going to 
use in the table 3.1. 


Table 3.1: Extract of important notations for current chapter

Notation                      Interpretation
$N_i$                         The number of nodes in each layer
$N$                           $\sum_i N_i$
$GH(\mathcal{L})$             Prediction of $\lambda_1(\mathcal{L})$ by Gaussian heuristic. $GH(\mathcal{L}) = (\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$
$R$                           The enumeration bound
$r$                           $R/GH(\mathcal{L})$
$q$                           $\approx \|b_i^*\|/\|b_{i+1}^*\|$

For pruned enumeration
$P_s(\mathbf{R}, R)$          Probability of finding a vector of known norm $R$
$P_{succ}(\mathbf{R}, R)$     Probability of finding any vector whose norm is smaller than $R$
$\mathbf{R}$                  Bounding function $\mathbf{R} = (R_1^2, R_2^2, \ldots)$
$\mathbf{r}$                  Normalized bounding function $\mathbf{r} = (R_1^2/R_n^2, \ldots)$


3.2.1 Algorithm description 

Given a basis $B$ of dimension $n$, the enumeration procedure tries to find an integer combination $x_1 b_1 + \cdots + x_n b_n$ of the basis vectors by going through the enumeration tree. The algorithm terminates when it has scanned all possibilities, and outputs the shortest vector that it has found. Alternatively, if we aim to find a short vector within a certain bound $R$, or if we have an upper bound on $\lambda_1(\mathcal{L})$, we can terminate as soon as we find the first vector satisfying this condition. Since we already have an estimate of the shortest vector in a random lattice, setting an upper bound with 99% confidence is not difficult. In this way the asymptotic complexity does not change, but we usually save a constant factor of time. The price is that we might fail with a certain probability, or end up with a second-shortest vector, unless we know the exact value of $\lambda_1(\mathcal{L})$ a priori, which can happen in some cryptosystems such as NTRU.

We adopt the algorithm description given in [29], where the program traverses the enumeration tree by depth-first search and always outputs the first vector found with norm shorter than $R$. To search through all short vectors, it suffices to set $R = \|b_1\|$ and modify the termination condition at line 11.

The description in pseudo-code is given in Algorithm 8.

Algorithm 8 Enumeration

Input: A basis $B$ and a bound $R$ on the norm of the short vector.
Output: A vector $v \in \mathcal{L}(B)$ such that $\|v\| < R$, or failure if no vector in $\mathcal{L}(B)$ is shorter than $R$.
 1: $[M, D, O] \leftarrow$ GramSchmidtOrthogonalization($B$)
 2: $(x_1, \ldots, x_n) \leftarrow (1, 0, \ldots, 0)$  // current combination
 3: $(\rho_1, \ldots, \rho_n, \rho_{n+1}) \leftarrow (0, 0, \ldots, 0)$  // sentinel element $\rho_{n+1} = 0$ by convention
 4: $(c_1, \ldots, c_n) \leftarrow (0, \ldots, 0)$  // $c_i = \sum_{j=i+1}^{n} x_j \mu_{j,i}$
 5: $(w_1, \ldots, w_n) \leftarrow (0, \ldots, 0)$  // jumps from previous $c_i$
 6: last_nonzero $\leftarrow 1$; $i \leftarrow 1$  // last $i$ such that $x_i \ne 0$; current level
 7: while true do
 8:     $\rho_i \leftarrow \rho_{i+1} + (x_i + c_i)^2 \cdot \|b_i^*\|^2$
 9:     if $\rho_i < R^2$ then
10:         if $i = 1$ then
11:             return $\sum_{j=1}^{n} x_j b_j$
12:         else
13:             $i \leftarrow i - 1$
14:             $c_i \leftarrow \sum_{j=i+1}^{n} x_j \mu_{j,i}$
15:             $x_i \leftarrow \lfloor -c_i \rceil$; $w_i \leftarrow 1$
16:         end if
17:     else
18:         $i \leftarrow i + 1$
19:         if $i = n + 1$ then
20:             return failure
21:         end if
22:         if $i \ge$ last_nonzero then
23:             last_nonzero $\leftarrow i$
24:             $x_i \leftarrow x_i + 1$  // only enumerate positive half
25:         else
26:             if $x_i > -c_i$ then $x_i \leftarrow x_i - w_i$ else $x_i \leftarrow x_i + w_i$
27:             $w_i \leftarrow w_i + 1$
28:         end if
29:     end if
30: end while


The enumeration tree consists of $n$ levels. At level $i$, for $1 \le i \le n$, we work in the projected basis $\pi_i(B)$ and look for nodes $(x_i, \ldots, x_n)$ such that the vector $v_i = \sum_{j=i}^{n} x_j \pi_i(b_j)$ has norm shorter than $R$. We write $\rho_i$ for $\|v_i\|^2$, which satisfies

$$\rho_i = \|x_n \pi_i(b_n) + \cdots + x_i \pi_i(b_i)\|^2 < R^2. \qquad (3.9)$$

We can decompose $B$ by computing the GSO of the basis, $MDO = B$. Then equation 3.9 can be rewritten as

$$\rho_i = x_n^2 \|b_n^*\|^2 + (x_{n-1} + x_n \mu_{n,n-1})^2 \|b_{n-1}^*\|^2 + \cdots + \Big(x_i + \sum_{j=i+1}^{n} x_j \mu_{j,i}\Big)^2 \|b_i^*\|^2 < R^2. \qquad (3.10)$$

When the node being visited in level $i$ satisfies equation 3.10, we go down to level $i - 1$ to visit its children, by lifting the vector to $\pi_{i-1}(B)$. For each child, $\rho_{i-1}$ is calculated from $\rho_i$ by the relation:

$$\rho_{i-1} = \rho_i + \Big(x_{i-1} + \sum_{j=i}^{n} x_j \mu_{j,i-1}\Big)^2 \|b_{i-1}^*\|^2. \qquad (3.11)$$


Clearly, p n C p n -i ^ ^ pi is an increasing sequence. When a node has several children, the visit 

is done in the increasing order of pi- 1 . The increment from p, to p;_i is (xi-\ + Ylj=i x jPj,i-1 ) |b*_ j || 2 . 

Its minimum value is reached when we take x,_i to be [— Y!j=i x jPj,i-i\ ■ We note c,-_ | = T!j=i x jFj,i-v 
and by symmetry we suppose Cf-i > 0, then our searching order is 


x--! = \-Ci- lj, = f-Ci-iJ +1, x^ = f-C-iJ -1, x^ = f-Q-iJ +2, ... until p;_i ^ R 


,( 2 ) 


,( 3 ) 


( 4 ) 


(3.12) 

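This order ensures that $p_{i-1}$ increases monotonically among all children of the same parent. Besides, $x_{i-1}^{(j+1)}$ can be computed directly from $x_{i-1}^{(j)}$ by

\[ x_{i-1}^{(j+1)} = \begin{cases} 2 \lfloor -c_{i-1} \rceil - x_{i-1}^{(j)} & \text{if } x_{i-1}^{(j)} - \lfloor -c_{i-1} \rceil > 0, \\ 2 \lfloor -c_{i-1} \rceil - x_{i-1}^{(j)} + 1 & \text{if } x_{i-1}^{(j)} - \lfloor -c_{i-1} \rceil \le 0. \end{cases} \tag{3.13} \]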


Thus the value of $p_{i-1}$ is computed entirely from its father if it is the first child, or otherwise from its father and its preceding sibling. In this way the memory we need is linear in the tree depth, that is, $O(n)$. There is one exception to the traversal order of $x_i$: in order to avoid checking both $v$ and $-v$, we only check the nodes whose last non-zero component is positive. For this purpose, the index of the last non-zero component is flagged.

Initially, we start by visiting the node $(x_1, \dots, x_n) = (1, 0, \dots, 0)$. If we manage to visit a valid node at level $i = 1$, we have found a vector whose norm is smaller than $R$, and the algorithm terminates.
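The traversal described above can be sketched in Python as follows. This is a minimal illustration, not the thesis implementation: it is written recursively instead of with the counters of Algorithm 8, and the names `gso`, `enumerate_short` and `visit` are hypothetical. It assumes the basis is given as a list of row vectors and returns the first nonzero coefficient vector whose squared norm is below `radius_sq`.

```python
def gso(basis):
    """Gram-Schmidt data: mu[j][i] = <b_j, b_i*> / ||b_i*||^2 and the ||b_i*||^2."""
    n = len(basis)
    bstar, bstar_sq = [], []
    mu = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [float(c) for c in basis[j]]
        for i in range(j):
            mu[j][i] = sum(a * b for a, b in zip(basis[j], bstar[i])) / bstar_sq[i]
            v = [a - mu[j][i] * b for a, b in zip(v, bstar[i])]
        bstar.append(v)
        bstar_sq.append(sum(c * c for c in v))
    return mu, bstar_sq


def enumerate_short(basis, radius_sq):
    """Coefficients x of a nonzero lattice vector with squared norm < radius_sq, or None."""
    n = len(basis)
    mu, bstar_sq = gso(basis)
    x = [0] * n

    def visit(i, p_parent, all_zero):
        # c_i = sum_{j>i} x_j mu_{j,i}; choosing x_i adds (x_i + c_i)^2 ||b_i*||^2.
        c = sum(x[j] * mu[j][i] for j in range(i + 1, n))
        first = round(-c)                  # integer closest to the centre -c_i
        step = 1 if -c >= first else -1    # side of the second-closest integer
        k = 0
        while True:
            if all_zero:
                cand = k                   # positive half only: 0, 1, 2, ...
            else:
                offset = (k + 1) // 2      # zig-zag around the centre
                cand = first + (step * offset if k % 2 else -step * offset)
            p = p_parent + (cand + c) ** 2 * bstar_sq[i]
            if p >= radius_sq:
                return None                # later siblings are at least as long
            x[i] = cand
            still_zero = all_zero and cand == 0
            if i == 0:
                if not still_zero:
                    return list(x)
            else:
                found = visit(i - 1, p, still_zero)
                if found is not None:
                    return found
            k += 1

    return visit(n - 1, 0.0, True)
```

Because siblings are visited in increasing order of the partial norm, the loop can stop at the first sibling that exceeds the bound, and only one coefficient per level is kept in memory.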


3.2.2 Complexity analysis 

The running time $T$ of this algorithm is proportional to the number $N$ of nodes in the enumeration tree, and the time to process each node is almost a constant $t_{node}$. $N$ is the sum of the $N_i$, the number of nodes at each level of the enumeration tree:

\[ T = N \cdot t_{node} = \sum_{i=1}^{n} N_i \cdot t_{node}. \tag{3.14} \]

The number of nodes on level $i$ is half the number of points of the lattice $\pi_i(B)$ with norm smaller than $R$. It can therefore be estimated by the Gaussian heuristic [29]. The volume of the projected lattice is

\[ \mathrm{vol}(\pi_i(B)) = \prod_{j=i}^{n} \|b_j^*\|. \tag{3.15} \]

We can use the Gaussian heuristic to predict $N_i$, the number of nodes at level $i$:

\[ N_i \approx \frac{1}{2} \cdot \frac{v_{n-i+1} R^{n-i+1}}{\prod_{j=i}^{n} \|b_j^*\|}, \tag{3.16} \]

where $v_k$ denotes the volume of the $k$-dimensional unit ball. The total number of nodes for the entire enumeration tree is $N = \sum_{i=1}^{n} N_i$.

For random lattices, after a typical reduction algorithm (LLL or BKZ for example), $\|b_i^*\| / \|b_{i+1}^*\| \approx q$ for some constant $q$ whose value depends on the algorithm. This approximation allows us to simplify the previous estimate:

\[ N_i \approx \frac{R^{n-i+1} \, v_{n-i+1} \, \|b_1\|^{i-1}}{2 \, q^{(i-2)(i-1)/2} \, \mathrm{vol}(\mathcal{L})}. \tag{3.17} \]

Depending on the choice of $R$, this yields different complexities:
• If one takes $R = \|b_1\|$, then (3.17) becomes


\[ N_i \approx \frac{q^{n(n-1)/2} \, v_{n-i+1}}{2 \, q^{(i-2)(i-1)/2}} = \frac{1}{2} \, q^{(n(n-1) - (i-2)(i-1))/2} \, v_{n-i+1}. \]


• If one takes $R = \Theta(\sqrt{n}) \cdot \mathrm{vol}(\mathcal{L})^{1/n}$, then (3.17) becomes

\[ N_i \approx \frac{q^{(i-1)(n-1)/2} \cdot 2^{O(n)}}{q^{(i-2)(i-1)/2}} = q^{(i-1)(n-i+1)/2} \cdot 2^{O(n)}. \]


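• If one takes $R = \mathrm{GH}(\mathcal{L}) \approx \sqrt{n/(2\pi e)} \cdot \mathrm{vol}(\mathcal{L})^{1/n}$, then (3.17) becomes

\[ N_i \approx \frac{\mathrm{vol}(\mathcal{L})^{(n-i+1)/n}}{\prod_{j=i}^{n} \|b_j^*\|} \left( \frac{n}{n-i+1} \right)^{(n-i+1)/2} \approx q^{(i-1)(n-i+1)/2} \left( \frac{n}{n-i+1} \right)^{(n-i+1)/2}. \]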


In all these cases, the maximal value is achieved for $i \approx n/2$, where (3.17) becomes

\[ N_i \approx q^{n^2/8} \, 2^{O(n)}, \tag{3.18} \]

and the complexity of the algorithm is super-exponential in $n$.
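As an illustration, the per-level estimate (3.16) can be evaluated numerically as in the following sketch. The function names are hypothetical, and the computation is done in the log domain (an implementation choice made here, not taken from the text) so that large dimensions do not overflow.

```python
import math


def log_ball_volume(k):
    """Logarithm of v_k, the volume of the k-dimensional unit ball."""
    return (k / 2) * math.log(math.pi) - math.lgamma(k / 2 + 1)


def level_node_counts(bstar_norms, radius):
    """Gaussian-heuristic estimates [N_1, ..., N_n], following (3.16).

    bstar_norms holds ||b_1*||, ..., ||b_n*|| (norms, not squared norms).
    """
    n = len(bstar_norms)
    logs = [math.log(b) for b in bstar_norms]
    counts = []
    for i in range(1, n + 1):
        k = n - i + 1                      # dimension of the projected lattice
        log_n_i = (math.log(0.5) + log_ball_volume(k)
                   + k * math.log(radius) - sum(logs[i - 1:]))
        counts.append(math.exp(log_n_i))
    return counts


def geometric_profile(n, b1_norm, q):
    """Idealized reduced basis with ||b_j*|| = ||b_1|| / q^(j-1)."""
    return [b1_norm / q ** j for j in range(n)]
```

For example, `sum(level_node_counts(geometric_profile(60, 1.0, 1.03), 1.0))` gives the estimated tree size for an idealized 60-dimensional profile; the values of `q` and the radius here are arbitrary illustrative inputs.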


3.3 Pruned enumeration for SVP 

Pruned enumeration was first introduced by Schnorr and Euchner as a subroutine in their reduction algorithm [87]. However, the first correct analysis of efficient pruned enumeration is attributed to Gama et al. [29]. They provide an efficient way to compute the success probability and complexity of pruned enumeration, which leads to great speed-ups.

A closer look at Equation (3.9) shows that the nodes of the enumeration tree do not all have the same probability of containing a solution in their branches. In enumeration, $p_n \le \cdots \le p_1$ is a monotonically increasing sequence, and if $p_i$ is already very close to $R^2$ when $i$ is still large, then its branches probably contain no child nodes within the bound $R$, and the sub-branch ends without finding any solution.

Consider a typical vector $v$ of dimension $n$ with norm $R$, and define $p_i = \sum_{j=1}^{i} v_j^2$. Experiments confirm that the sequence $p_i$ most likely increases slowly at an almost constant speed until reaching $p_n = R^2$. On the other hand, if the $p_i$ sequence of a vector $v$ increases much faster than $(i/n) \cdot R^2$, then the probability that $\|v\| \le R$ is small.

Figure 3.4 depicts the typical $p_i$ sequence for a vector randomly sampled from the surface of the hypersphere of radius $R$ in dimension $n = 40$. We generate 1000 samples and plot the mean value of $p_i$, with the standard deviation as the error bar.


Figure 3.4: The mean value and standard deviation of $p_i$ for points randomly distributed over the unit sphere of dimension 40.
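The experiment behind this figure can be reproduced along the following lines. This is only a sketch under stated assumptions: points on the unit sphere are obtained by normalizing Gaussian vectors, and the function name and default seed are hypothetical.

```python
import math
import random


def partial_norm_profile(n=40, samples=1000, seed=0):
    """Mean and standard deviation of p_i = sum_{j<=i} u_j^2 for u uniform on the unit sphere."""
    rng = random.Random(seed)
    sums = [0.0] * n
    squares = [0.0] * n
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm_sq = sum(c * c for c in g)    # normalizing g gives a uniform point on the sphere
        acc = 0.0
        for i, c in enumerate(g):
            acc += c * c / norm_sq
            sums[i] += acc
            squares[i] += acc * acc
    mean = [s / samples for s in sums]
    std = [math.sqrt(max(0.0, sq / samples - m * m)) for sq, m in zip(squares, mean)]
    return mean, std
```

The mean is expected to grow roughly like $i/n$, with the largest spread around the middle indices.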


The idea of pruning consists of discarding branches that are unlikely to produce a solution. That is, we predefine a sequence $\mathbf{R} = (R_1^2, \dots, R_n^2)$ satisfying $R_1^2 \le \cdots \le R_n^2 = R^2$ as our bounding function. At level $i$ of the enumeration tree, we narrow the search by only checking the nodes within the bound $R_{n-i+1}$. In this way, we are actually enumerating nodes $\sum_{j=i}^{n} x_j \pi_i(b_j) \in \mathbb{R}^{n-i+1}$ that lie inside the following set:

\[ C_{R_1^2, \dots, R_{n-i+1}^2} = \Big\{ (z_1, \dots, z_{n-i+1}) \in \mathbb{R}^{n-i+1} : \forall j \le n-i+1,\ \sum_{l=1}^{j} z_l^2 \le R_j^2 \Big\}. \tag{3.19} \]

To introduce pruning into the algorithm, it suffices to change the condition (3.9) into

\[ p_i = \| x_n \pi_i(b_n) + \cdots + x_i \pi_i(b_i) \|^2 < R_{n-i+1}^2, \tag{3.20} \]

and to change line 9 of Algorithm 8 accordingly.

3.3.1 Running time analysis 

We use the Gaussian heuristic to predict the number of nodes at each level. We need to compute the volume of the convex set $C_{R_1^2, \dots, R_{n-i+1}^2}$ for each level $i$, and it follows that

\[ N_i = \frac{1}{2} \cdot \frac{\mathrm{vol}\big( C_{R_1^2, \dots, R_{n-i+1}^2} \big)}{\prod_{j=i}^{n} \|b_j^*\|}. \tag{3.21} \]

A naive way to do so is the Monte Carlo method. That is, we uniformly sample many points from a hyperball of radius $R_l$ in $\mathbb{R}^l$; the probability that a point falls into $C_{R_1^2, \dots, R_l^2}$ is the ratio between the volume of $C_{R_1^2, \dots, R_l^2}$ and $v_l \cdot R_l^l$, the volume of the hyperball.

A beautiful analysis by Gama et al. [29] also gives a way to compute the exact volume of $C_{R_1^2, \dots, R_n^2}$ under the constraint $R_1 = R_2 \le R_3 = R_4 \le \cdots \le R_{n-1} = R_n$. We cite their analysis and proof in the following.

The distribution of the vector $(u_1^2 + u_2^2, u_3^2 + u_4^2, \dots, u_{n-1}^2 + u_n^2)$ when $u$ is uniformly chosen from $\mathrm{Ball}_n = \{ u \in \mathbb{R}^n : \sum_{j=1}^{n} u_j^2 \le 1 \}$ is given by a Dirichlet distribution with parameters $(1, \dots, 1)$, which is simply the uniform distribution over the set of all vectors whose coordinates are non-negative and sum to at most 1 (see page 593 of [25]). More precisely:

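Lemma 3.3.1. Let $n = 2\ell$ be even. Let $R_1 = R_2$, $R_3 = R_4$, ..., $R_{n-1} = R_n$ with $0 < R_1 \le R_3 \le \cdots \le R_{n-1}$. For any $(t_1, \dots, t_\ell) \in \mathbb{R}_{\ge 0}^{\ell}$, denote by $V_\ell(t_1, \dots, t_\ell)$ the following polytope:

\[ V_\ell(t_1, \dots, t_\ell) = \Big\{ (x_1, \dots, x_\ell) \in \mathbb{R}^{\ell} \ \text{s.t.}\ \forall i \in \{1, \dots, \ell\},\ x_i \ge 0 \ \text{and}\ \sum_{j=1}^{i} x_j \le t_i \Big\}. \]

Let $(r_1, r_2, \dots, r_n) = \big( R_1^2/R_n^2, R_2^2/R_n^2, \dots, R_n^2/R_n^2 \big)$. Then:

\[ \Pr_{u \sim \mathrm{Ball}_n} \Big( \forall j \in [1, n],\ \sum_{i=1}^{j} u_i^2 \le r_j \Big) = \ell! \cdot \mathrm{vol}\, V_\ell(r_2, r_4, \dots, r_{n-2}, r_n) \tag{3.22} \]

with:

\[ \mathrm{vol}\, V_\ell(t_1, \dots, t_\ell) = \int_{y_1=0}^{t_1} \int_{y_2=y_1}^{t_2} \cdots \int_{y_\ell=y_{\ell-1}}^{t_\ell} dy_\ell \cdots dy_2 \, dy_1. \tag{3.23} \]

Furthermore, the integral of (3.23) can be computed numerically as follows:

1. Let $t_1, \dots, t_\ell \in \mathbb{R}$ be given.

2. Let $Q_0 = 1$.

3. For $i = \ell$ downto 1 do:

(a) Compute the unique polynomial $Q \in \mathbb{R}[X]$ such that $Q'(X) = Q_{\ell-i}(X)$ and $Q(0) = 0$.

(b) Let $Q_{\ell-i+1} \in \mathbb{R}[X]$ be defined by $Q_{\ell-i+1}(X) = Q(t_i) - Q(X)$.

4. Return $Q_\ell(0)$ as $\mathrm{vol}\, V_\ell(t_1, \dots, t_\ell)$.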

Proof. Recall that the distribution of the vector $(u_1^2 + u_2^2, u_3^2 + u_4^2, \dots, u_{n-1}^2 + u_n^2)$ when $u$ is chosen uniformly from $\mathrm{Ball}_n$ is simply the uniform distribution over the set of all vectors whose coordinates are non-negative and sum to at most 1. This proves that:

\[ \Pr_{u \sim \mathrm{Ball}_n} \Big( \forall j \in [1, n],\ \sum_{i=1}^{j} u_i^2 \le r_j \Big) = \frac{\mathrm{vol}\, V_{n/2}(r_2, r_4, \dots, r_{n-2}, r_n)}{\mathrm{vol}\, V_{n/2}(1, 1, \dots, 1)}. \]

Notice that $\mathrm{vol}\, V_{n/2}(1, 1, \dots, 1) = 1/\ell!$ because it is the volume of the standard simplex, which proves (3.22).

In general, computing the volume of a polytope is not easy, but our polytope has a special shape which makes it easy:

\[ \mathrm{vol}\, V_\ell(t_1, \dots, t_\ell) = \int_{x_1=0}^{t_1} \int_{x_2=0}^{t_2 - x_1} \int_{x_3=0}^{t_3 - x_1 - x_2} \cdots \int_{x_\ell=0}^{t_\ell - \sum_{j=1}^{\ell-1} x_j} dx_\ell \cdots dx_3 \, dx_2 \, dx_1, \]

which by the change of variables $y_i = \sum_{j=1}^{i} x_j$ becomes:

\[ \mathrm{vol}\, V_\ell(t_1, \dots, t_\ell) = \int_{y_1=0}^{t_1} \int_{y_2=y_1}^{t_2} \int_{y_3=y_2}^{t_3} \cdots \int_{y_\ell=y_{\ell-1}}^{t_\ell} dy_\ell \cdots dy_3 \, dy_2 \, dy_1, \]

which proves (3.23).

Finally, we note that this integral can be evaluated easily, by successive integrations giving rise to a multivariate polynomial in $t_1, \dots, t_\ell$. Numerically, it can be evaluated by the iterative process described in Algorithm 9.


Algorithm 9 VolumePolytope

Input: $R_1, \dots, R_n$
Output: The volume of the polytope defined in (3.23)
1: $C \leftarrow 1$; // $C \in \mathbb{R}[X]$ is a polynomial
2: for $i = n, n-1, \dots, 1$ do
3:     $C \leftarrow \int_0^x C(t) \, dt$;
4:     $C \leftarrow C(R_i^2/R_n^2) - C(x)$;
5: end for
6: return $C(0)$


□ 
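The iterative evaluation described in the lemma can be sketched in Python as follows. The function name `volume_polytope` is hypothetical; the input is the list $(t_1, \dots, t_\ell)$ of the lemma, and exact rational arithmetic is used here purely for illustration.

```python
from fractions import Fraction


def volume_polytope(t):
    """Volume of {x_i >= 0, x_1 + ... + x_i <= t_i for all i}, following steps 1-4 of the lemma."""
    q = [Fraction(1)]                          # Q_0 = 1, coefficients by increasing degree
    for t_i in reversed(t):                    # i = l downto 1
        # Antiderivative Q of the current polynomial with Q(0) = 0.
        prim = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(q)]
        value = sum(c * Fraction(t_i) ** k for k, c in enumerate(prim))
        # Next polynomial: Q(t_i) - Q(X).
        q = [value - prim[0]] + [-c for c in prim[1:]]
    return q[0]                                # Q_l(0)
```

For instance, `volume_polytope([1, 1, 1])` is the volume of the standard 3-dimensional simplex, and multiplying `volume_polytope(t)` by $\ell!$ gives the probability in (3.22).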

The cost for even layers in the enumeration tree can be estimated using the previous lemma, and the odd layers can be estimated by interpolation. For a general $\mathbf{R} = (R_1^2, R_2^2, \dots, R_n^2)$, the cost estimate of the bounding function $\mathbf{R}_{upper} = (R_2^2, R_2^2, R_4^2, R_4^2, \dots)$ serves as an upper estimate, and that of the bounding function $\mathbf{R}_{lower} = (R_1^2, R_1^2, R_3^2, R_3^2, \dots)$ serves as a lower estimate, as illustrated in Figure 3.5:

\[ N_i(\mathbf{R}_{lower}) \le N_i(\mathbf{R}) \le N_i(\mathbf{R}_{upper}), \quad i = 2, 4, \dots \tag{3.24} \]

Figure 3.5: Upper and lower estimates of the cost of a given bounding function.

Fig. 3.6 shows an example of estimating the number of nodes in even layers for enumeration with different pruning strategies. The number of nodes is shown on a logarithmic scale, and the pruning strategies correspond to full enumeration (no pruning), success probability 95% and success probability 25%. The colored zone of Fig. 3.6 is the gap between the upper estimate and the lower estimate for even layers. According to the figure, these bounds are fairly close to each other, which suggests estimating the total number of nodes by a loose and cheap interpolation, as done in Algorithm 10.



Figure 3.6: Accuracy of the estimate for the enumeration cost. 


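For odd layers, we approximate their $N_i$ values by averaging the neighboring $N_i$ of the even layers, i.e. (with the convention $N_0 = 0$)

\[ N = \sum_{i=1}^{n} N_i \approx \sum_{i=1}^{\lfloor n/2 \rfloor} \Big( N_{2i} + \frac{N_{2i-2} + N_{2i}}{2} \Big) \approx 2 \cdot \sum_{i=1}^{\lfloor n/2 \rfloor} N_{2i}. \]

Here, the last approximations are due to neglecting the $N_i$ terms for $i = 1$, $i = n$, and $i = n - 1$ if $n$ is odd. For these layers, $N_i$ is usually very small.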

Thus we have Algorithm 10, which efficiently computes an estimate of the enumeration cost for a general $\mathbf{R}$ as the average of the upper and lower estimates. It is easy to write an analogue of this algorithm that computes only the upper or the lower estimate.

Algorithm 10 Estimation of the enumeration cost

Input: A bounding function $\mathbf{R} = (R_1^2 \le \cdots \le R_n^2)$, and the Gram-Schmidt squared norms $\|b_1^*\|^2, \dots, \|b_n^*\|^2$ of the input basis for enumeration.
Output: Estimate of the total number of nodes $N$ to be enumerated when using the Gaussian heuristic as the enumeration radius.
1: $N \leftarrow 0$;
2: for $i = 2, 4, \dots$ do
3:     $f_i \leftarrow i \cdot \log R_i + \log(v_i) - \sum_{j=n-i+1}^{n} \log(\|b_j^*\|)$; // log of the cost of full enumeration within radius $R_i$ (level $i$)
4:     $C_i \leftarrow \frac{1}{2} \big( \mathrm{VolumePolytope}(R_1^2, R_3^2, \dots, R_{i-1}^2) + \mathrm{VolumePolytope}(R_2^2, R_4^2, \dots, R_i^2) \big)$;
5:     $N_i \leftarrow (i/2)! \cdot C_i \cdot \exp(f_i)$; // $C_i \cdot (i/2)! = \Pr_{u \sim \mathrm{Ball}_i} \big( \forall j \in [1, i],\ \sum_{l=1}^{j} u_l^2 \le R_j^2/R_i^2 \big)$
6:     $N \leftarrow N + N_i$;
7: end for
8: return $2N$


3.3.2 Success probability for finding a single vector 

We want to investigate the probability of successfully finding a vector with a given pruning strategy. In 
this subsection, we first study the probability of finding a single vector v with fixed norm. By taking 
into consideration the density of vectors within a ball of radius r, we will extend our computation to 
the probability of finding any of the vectors within range. 

Let $p_s(\|v\|, \mathbf{R})$ be the probability that a vector of norm $\|v\|$ is found by enumeration with pruning $\mathbf{R} = (R_1^2, \dots, R_n^2)$. The analysis of the basic case where $\|v\|^2 = R_n^2$ was first presented by Gama et al. [29]. We naturally extend the computation to general $\|v\|$.

To define such a success probability, it is important to make our assumptions clear. Enumeration is a deterministic algorithm: once the bounding function $\mathbf{R}$ is fixed, every run gives the same result. The "probability" here refers to the fact that we have no information about $v$ in advance; in particular, we do not know the direction $v/\|v\|$, although we are given $\|v\|$. To us, $v$ looks like a random vector uniformly distributed over the sphere of radius $\|v\|$. In other words, the probability measures the event that such a random vector is found by our enumeration routine, assuming $v$ is uniformly distributed over the sphere. Alternatively, if we are given many sufficiently random independent input bases, then the event that the program outputs the target vector is also measured by the probability $p_s$.

To be exact, we need the following heuristic assumption on the input basis: 

Heuristic 3.3.2. The distribution of the coordinates of the target vector $v$, when written in the normalized Gram-Schmidt basis $(b_1^*/\|b_1^*\|, \dots, b_n^*/\|b_n^*\|)$ of the input basis, looks like that of a uniformly distributed vector of norm $\|v\|$.


This heuristic says that in a random basis, the vector $v$ is not oriented in any particular direction. A basis $B$ of a fixed lattice $\mathcal{L}$ that is sufficiently randomized and not too strongly reduced will satisfy the randomness requirement of the assumption. Because the number of vectors of norm at most $r \cdot (\mathrm{vol}(\mathcal{L})/v_n)^{1/n}$ is expected to be $r^n$, there will be sufficiently many such vectors to choose from; as a result, the collection of these bases behaves like random bases when we calculate $p_s$.

We first consider the basic case where $\|v\|^2 = R_n^2$. Let $\mathbf{r}$ be the normalization of $\mathbf{R}$, i.e.

\[ \mathbf{r} = (r_1, \dots, r_n) = \Big( \frac{R_1^2}{R_n^2}, \frac{R_2^2}{R_n^2}, \dots, \frac{R_n^2}{R_n^2} \Big). \]

Obviously $p_s(R_n, \mathbf{R}) = p_s(1, \mathbf{r})$. We write $p_s(\mathbf{r})$ for $p_s(1, \mathbf{r})$ for short.

Let $v$ be the normalized target vector, and let $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ be its coordinates in the orthonormal basis $(b_1^*/\|b_1^*\|, \dots, b_n^*/\|b_n^*\|)$. By definition, $v$ belongs to the pruned tree if and only if $\sum_{j=k}^{n} x_j^2 \le r_{n-k+1}$ for all $k = 1, \dots, n$. By Heuristic 3.3.2, $x$ is uniformly distributed over the surface of the hypersphere of radius $\|x\| = \|v\|$. Writing $S^{n-1}$ for the unit sphere in dimension $n$, $p_s$ can be estimated as

\[ p_s(\mathbf{r}) = \Pr_{u \sim S^{n-1}} \Big( \forall k \in [1, n],\ \sum_{j=k}^{n} u_j^2 \le r_{n-k+1} \Big). \tag{3.25} \]


Monte Carlo estimation 

As before, one can estimate this probability through Monte Carlo simulation. That is, we uniformly generate many points on the unit sphere, and count the number of points which satisfy the bounding function $\mathbf{r}$. This is described in Algorithm 11.

A random point on the unit sphere can be generated in the following way. First, its $n$ coordinates $v_1, \dots, v_n$ are sampled from independent identical Gaussian distributions; then the vector is normalized to have unit norm. We use the property that when $X$ is a centered Gaussian variable with variance 1, $X^2$ follows the $\chi^2$ distribution. This follows from the following theorem.

Theorem 3.3.3. If $X_1, X_2, \dots, X_n$ are i.i.d. standard Gaussian variables, then the sum of their squares has the chi-squared distribution with $n$ degrees of freedom:

\[ X_1^2 + X_2^2 + \cdots + X_n^2 \sim \chi_n^2. \]

Besides, the chi-squared distribution is a special case of the gamma distribution: $X \sim \chi_k^2$ implies $X \sim \Gamma(k/2, 2)$. In practice, many computational software packages have integrated support for the gamma distribution. We therefore use the gamma distribution in our algorithm description.


Algorithm 11 Estimate success probability using Monte Carlo

Input: A bounding function $\mathbf{r} = (r_1, \dots, r_n)$, with $r_1 \le \cdots \le r_n$.
Output: An estimate of $p_s$ with Monte Carlo sampling over a sphere.
1: $t \leftarrow 0$; // counter for successful samples
2: for $i = 1, \dots,$ NUM_SAMPLES do
3:     $s \leftarrow (\mathrm{GenerateRandomGamma}(0.5, 2))_{1 \times n}$; // $s = (s_1, \dots, s_n)$ is a vector of length $n$, where each coordinate follows an independent distribution Gamma(0.5, 2)
4:     $c_1 \leftarrow s_1$;
5:     for $j = 2, \dots, n$ do
6:         $c_j \leftarrow s_j + c_{j-1}$; // $c_j \leftarrow \sum_{k=1}^{j} s_k$
7:     end for
8:     if $\forall j \in \{1, \dots, n\},\ c_n \cdot r_j \ge c_j$ then
9:         $t \leftarrow t + 1$; // one more successful point
10:     end if
11: end for
12: return $t$ / NUM_SAMPLES;
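A direct Python transcription of this Monte Carlo estimate might look as follows. It is a sketch only: the function name, the default sample count and the fixed seed are assumptions, and the standard library call `random.gammavariate(0.5, 2.0)` (shape 1/2, scale 2) plays the role of GenerateRandomGamma.

```python
import random


def success_probability_mc(r, num_samples=10000, seed=0):
    """Monte Carlo estimate of p_s(r) for a normalized bounding function r."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        prefix, acc = [], 0.0
        for _ in r:
            acc += rng.gammavariate(0.5, 2.0)  # squared standard Gaussian coordinate
            prefix.append(acc)                 # c_j = s_1 + ... + s_j
        # Success if c_j <= c_n * r_j for every j.
        if all(c_j <= prefix[-1] * r_j for c_j, r_j in zip(prefix, r)):
            hits += 1
    return hits / num_samples
```

Comparing $c_j$ with $c_n \cdot r_j$ avoids normalizing each sample explicitly.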


Exact computation and approximate bounds 

In some cases we can compute $p_s(\mathbf{r})$ exactly. In fact, when a vector $u$ is uniformly sampled from $S^{n-1}$, its first $n - 2$ coordinates $(u_1, \dots, u_{n-2})$ are uniformly distributed over the unit ball $\mathrm{Ball}_{n-2}$, so that $(u_1^2 + u_2^2, \dots, u_{n-3}^2 + u_{n-2}^2)$ follows the Dirichlet distribution of parameter $(1, \dots, 1)$, i.e. it is uniform over all vectors whose coordinates are non-negative and sum to at most 1. Therefore, when the dimension $n$ is even and the bounding function satisfies $r_1 = r_2, r_3 = r_4, \dots, r_{n-1} = r_n = 1$, the precise value of the success probability $p_s(\mathbf{r})$ can be computed numerically using Equation (3.22), which we restate here:

\[ \Pr_{u \sim \mathrm{Ball}_{n-2}} \Big( \forall j \in [1, n-2],\ \sum_{i=1}^{j} u_i^2 \le r_j \Big) = \Big( \frac{n}{2} - 1 \Big)! \cdot \mathrm{vol}\, V_{n/2-1}(r_2, r_4, \dots, r_{n-2}). \tag{3.26} \]


For a general bounding function $\mathbf{r}$, we can still give an upper estimate with $\mathbf{r}'' = (r_2, r_2, r_4, r_4, \dots, r_n, r_n)$, such that

\[ p_s(\mathbf{r}) \le p_s(\mathbf{r}''). \tag{3.27} \]

However, $\mathbf{r}' = (r_1, r_1, \dots, r_{n-1}, r_{n-1})$ does not provide a lower estimate when $r_{n-1} < 1$, because $p_s(\mathbf{r}') = 0$ always holds in this case. When $r_{n-1} \approx 1$, we get a lower estimate with $\mathbf{r}' = (r_1, r_1, \dots, r_{n-3}, r_{n-3}, 1, 1)$. When the dimension is odd, we approximate $p_s(\mathbf{r})$ by $p_s(r_1, r_2, \dots, 1)$.

Algorithm 12 gives an estimate of $p_s(\mathbf{r})$ for a general bounding function $\mathbf{r}$ by averaging these two estimates.


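Algorithm 12 Estimate success probability $p_s(\mathbf{r})$

Input: A bounding function $\mathbf{r} = (r_1, \dots, r_n)$ satisfying $r_1 \le \cdots \le r_n = 1$.
Output: An estimate of $p_s(\mathbf{r})$.
1: $\mathbf{r}' \leftarrow (r_1, r_3, \dots, r_{2\lceil n/2 \rceil - 3})$;
2: $\mathbf{r}'' \leftarrow (r_2, r_4, \dots, r_{2\lceil n/2 \rceil - 2})$;
3: $p_{low} \leftarrow \mathrm{VolumePolytope}(\mathbf{r}') \cdot (\lceil n/2 \rceil - 1)!$;
4: $p_{high} \leftarrow \mathrm{VolumePolytope}(\mathbf{r}'') \cdot (\lceil n/2 \rceil - 1)!$;
5: return $(p_{low} + p_{high})/2$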


Extend analysis to general ||v|| 

For general $\|v\|$, we make the following observation: when $\|v\|^2 > R_n^2$, the pruned enumeration will never return the solution searched for, because the convex body defined by $\mathbf{R}$ is completely contained in the interior of the ball of radius $\|v\|$. When $\|v\|^2 < R_n^2$, $v$ can lie in the intersection of the sphere of radius $\|v\|$ and the body defined by $\mathbf{R}$. In other words, if we define the bounding function $\mathbf{R}' = \mathrm{trim}(\mathbf{R}, \|v\|^2)$, where the trim operation consists of substituting $R_i^2$ by $\min(\|v\|^2, R_i^2)$:

\[ \mathrm{trim}(\mathbf{R}, x) = (\min(x, R_1^2), \min(x, R_2^2), \dots, \min(x, R_n^2)), \tag{3.28} \]

then the probability that $v$ is found by enumeration with bounding function $\mathbf{R}$ is the same as with bounding function $\mathbf{R}'$, i.e. $p_s(\|v\|, \mathbf{R}) = p_s(\|v\|, \mathrm{trim}(\mathbf{R}, \|v\|^2))$. Now $\mathbf{R}'$ satisfies $R_n'^2 = \|v\|^2$, and we are back in the basic case.
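The trim operation and the reduction to the basic case can be sketched as follows. The function names are hypothetical, the bounding function is assumed to be passed as the list of squared bounds $(R_1^2, \dots, R_n^2)$, and `None` is used here as a stand-in for "success probability 0".

```python
def trim(bounds_sq, x):
    """Cap every squared bound at x, as in (3.28)."""
    return [min(x, b) for b in bounds_sq]


def reduce_to_basic_case(norm_sq, bounds_sq):
    """Normalized bounding function r for a target of squared norm norm_sq.

    Returns None when the target lies outside the largest bound, in which
    case the pruned enumeration can never find it.
    """
    if norm_sq > bounds_sq[-1]:
        return None
    trimmed = trim(bounds_sq, norm_sq)
    return [b / trimmed[-1] for b in trimmed]  # last entry becomes 1
```

The returned list can then be fed to any routine that evaluates $p_s(\mathbf{r})$ in the basic case.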

In conclusion, $p_s(\|v\|, \mathbf{R})$ can be reduced to $p_s(\mathbf{r})$ for some $\mathbf{r}$:

\[ p_s(\|v\|, \mathbf{R}) = \begin{cases} 0 & \text{if } \|v\|^2 > R_n^2, \\ p_s(1, \mathbf{r}) & \text{if } \|v\|^2 = R_n^2, \\ p_s(\|v\|, \mathrm{trim}(\mathbf{R}, \|v\|^2)) & \text{if } \|v\|^2 < R_n^2. \end{cases} \tag{3.29} \]

3.3.3 Success probability considering multiple vectors

In a more general case, we only know the distribution of short vectors of the lattice before enumeration. We are not aiming to find a certain vector of a specific norm; more generally, we are interested in the probability of finding any of them, which we denote by $p_{succ}(R, \mathbf{R})$.

We approximate the result using a finite element method, by discretizing $R$ into $s$ intervals. For each interval, we count the expected number of vectors belonging to it, and compute their expected norm $k(R, R + \Delta R)$. All vectors in the interval are then considered to be of norm $k$, and they are assumed to be independently and uniformly distributed over the sphere of radius $k$. The probability of finding any of them is $1 - (1 - p_s(k, \mathbf{R}))^m$ when the interval contains $m$ vectors. To account for the event that some shorter vector has already been found for some $R' < R$, we only count the first vector found, which adds a factor of $1 - p_{succ}(R, \mathbf{R})$. Finally, we sum the probabilities over all intervals. The detailed description is given as follows.


Finite element analysis 


We predict the distribution of vectors with the Gaussian heuristic. Note that the rigorous analysis of short vectors in a lattice using a Poisson process gives identical results. Within radius $R$, the total number of vectors (excluding the symmetric half) is

\[ t(R) = \frac{1}{2} \cdot \frac{v_n R^n}{\mathrm{vol}(\mathcal{L})}. \tag{3.30} \]


Therefore, in the interval $[R, R + \Delta R]$, the expected number of vectors is $t(R + \Delta R) - t(R)$, and their expected length is

\[ k(R, R + \Delta R) = \frac{n}{n+1} \cdot \frac{(R + \Delta R) \, t(R + \Delta R) - R \, t(R)}{t(R + \Delta R) - t(R)}. \tag{3.31} \]

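Then the probability that any vector of norm between $R$ and $R + \Delta R$ is found is approximately

\[ 1 - \big( 1 - p_s(k(R, R + \Delta R), \mathbf{R}) \big)^{t(R + \Delta R) - t(R)}. \]

Let $p_{succ}(R, \mathbf{R})$ be the probability that enumeration with pruning $\mathbf{R}$ finds any vector shorter than $R$. Then we can compute $p_{succ}(R, \mathbf{R})$ inductively by increasing $R$ by $\Delta R$:

\[ p_{succ}(R + \Delta R, \mathbf{R}) = p_{succ}(R, \mathbf{R}) + (1 - p_{succ}(R, \mathbf{R})) \Big( 1 - \big( 1 - p_s(k(R, R + \Delta R), \mathbf{R}) \big)^{t(R + \Delta R) - t(R)} \Big). \tag{3.32} \]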


Choosing intervals of R 

What we do here is approximate the length of each vector by the expected norm over all vectors in the interval $[R, R + \Delta R]$. We have to choose $\Delta R$ carefully to ensure that this is a good approximation. We can always make $\Delta R$ small enough to avoid precision issues, but we try to avoid computing over too many intervals, in order to speed up the computation.

At first it might seem tempting to make the following choices for $\Delta R$:

• The most naive way is to choose $\Delta R$ to be a small constant. In this case, however, the number of vectors in the interval $[R, R + \Delta R]$ grows rapidly as $R$ increases, and $k(R, R + \Delta R)$ may not be representative enough for all vectors in that interval.

• On the other hand, choosing $\Delta R = O(R^{1/n})$ ensures that there is a constant number of vectors in each interval, but the value of $\Delta R$ may be too big when $R$ is small, making $k(R, R + \Delta R)$ unrepresentative again.


These two examples represent the two extreme cases that should be avoided. We now generalize the strategy for dividing the interval $[0, R]$ by introducing a parameter $\nu$. When $\nu = 1$, this amounts to choosing a constant $\Delta R$, while for $\nu = n$ the strategy chooses $\Delta R = O(R^{1/n})$, which is the second example given. Setting $\nu \in (1, n)$ gives an intermediate choice.

Our method consists of dividing the interval $[0, R]$ into $s$ parts by defining the sequence

\[ s_i = \Big( \frac{i}{s} \Big)^{1/\nu} R, \quad \forall\, 0 \le i \le s, \tag{3.33} \]

where $\nu \in [1, n]$ is a parameter.
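To make the role of the parameter concrete, the following sketch computes the interval endpoints, assuming the sequence has the form $s_i = (i/s)^{1/\nu} R$, which agrees with line 3 of Algorithm 13 for $\nu = n/4$ and with the two extreme cases described above. The function name is hypothetical.

```python
def interval_endpoints(radius, s, nu):
    """Endpoints s_0, ..., s_s with s_i = (i/s)^(1/nu) * radius."""
    return [(i / s) ** (1.0 / nu) * radius for i in range(s + 1)]
```

With `nu = 1` the endpoints are equally spaced, whereas with `nu = n` the quantities `s_i ** n`, and hence the expected number of vectors per interval under the Gaussian heuristic, are equally spaced.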

Figure 3.7 illustrates the $s_i$ sequences for different choices of $\nu$. The solid line shows $t(R)$ for $0 < R < 1.05 \cdot \mathrm{GH}(\mathcal{L})$, the expected number of vectors as a function of the radius in a typical 70-dimensional lattice. The x points, + points and o points correspond to $\nu = 1, 17.5, 70$ respectively. Notice that the majority of the vectors are expected to have radius in the interval $[1, 1.05]$. For $\nu = 1$, the interval $[1, 1.05]$ is underrepresented, while for $\nu = 70$, the interval $[0, 1]$ is underrepresented.


Figure 3.7: $t(R)$, $0 < R < 1.05 \cdot \mathrm{GH}(\mathcal{L})$, for a 70-dimensional lattice of unit volume (solid line). The points show the $s_i$ sequences for $\nu = 1$ (x), $\nu = 17.5$ (+) and $\nu = 70$ (o).


In order to choose a good $\nu$, we tested the speed of convergence of the estimate as $s$ increases. After numerous experiments we found that $n/8 < \nu < n/2$ gives the best convergence, and we therefore fix $\nu = n/4$ from now on. We also fix the total number of intervals to $s = 1000$ in practice. Algorithm 13 gives the pseudo-code for estimating $p_{succ}(R, \mathbf{R})$. By default we also write $p_{succ}(\mathbf{R})$ for $p_{succ}(R_n, \mathbf{R})$.

Algorithm 13 Evaluate $p_{succ}(\mathbf{R})$

Input: A pruning function $\mathbf{R} = (R_1, \dots, R_n)$; the radius of enumeration $R$.
Output: $p_{succ}(R, \mathbf{R})$, the probability that enumeration with pruning $\mathbf{R}$ finds a vector shorter than $R$, and $E_R$, the expected norm of the found vector.
1: $p \leftarrow 0$, $U \leftarrow 0$
2: for $i = 1, \dots, 1000$ do
3:     $s_i \leftarrow (i/1000)^{4/n} R$
4:     $q \leftarrow (1 - p) \Big( 1 - \big( 1 - p_s(k(s_{i-1}, s_i), \mathbf{R}) \big)^{t(s_i) - t(s_{i-1})} \Big)$
5:     $p \leftarrow p + q$
6:     $U \leftarrow U + q \cdot k(s_{i-1}, s_i)$
7: end for
8: $E_R \leftarrow U/p$
9: return $E_R, p$
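The finite element computation can be sketched in Python as below. Several things here are assumptions made for illustration: the single-vector probability is passed in as a hypothetical callable `p_single(k)` standing for $p_s(k, \mathbf{R})$, the vector count is the Gaussian-heuristic estimate $t(R) = v_n R^n / (2\,\mathrm{vol}(\mathcal{L}))$, and the interval endpoints are $(i/s)^{1/\nu} R$ with $\nu = n/4$ by default.

```python
import math


def expected_count(radius, n, volume=1.0):
    """t(R): expected number of lattice vectors (up to sign) of norm at most R."""
    log_vn = (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)
    return 0.5 * math.exp(log_vn + n * math.log(radius)) / volume


def success_probability(radius, n, p_single, volume=1.0, steps=1000, nu=None):
    """Return (p_succ, expected norm of the vector found), interval by interval."""
    nu = n / 4 if nu is None else nu
    p, u = 0.0, 0.0
    s_prev, t_prev = 0.0, 0.0
    for i in range(1, steps + 1):
        s_i = (i / steps) ** (1.0 / nu) * radius
        t_i = expected_count(s_i, n, volume)
        dt = t_i - t_prev                      # expected number of vectors in the interval
        if dt > 0:
            # Expected norm of the vectors in [s_prev, s_i].
            k = n / (n + 1) * (s_i * t_i - s_prev * t_prev) / dt
            # Probability that the first success happens in this interval.
            q = (1 - p) * (1 - (1 - p_single(k)) ** dt)
            p += q
            u += q * k
        s_prev, t_prev = s_i, t_i
    return p, (u / p if p > 0 else None)
```

When `p_single` is a constant $c$, the recursion telescopes to $1 - (1 - c)^{t(R)}$, which gives a convenient sanity check for an implementation.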


3.4 Bounding function 

In this section, we show how to find an optimal bounding function $\mathbf{R}$ for enumeration. Our estimates of $T_{enum}$ and $p_{succ}$ are precise only when $R_{2i-1} = R_{2i}$, so we only search for functions satisfying this relationship. Define $R_{odd}(\mathbf{R}) = \{R_1, R_3, \dots, R_{2\lceil n/2 \rceil - 1}\}$. Instead of searching for optimal bounding functions in $\mathcal{R}_n$, the set of all possible bounding functions, we restrict our search domain to $\mathcal{R}_{\lceil n/2 \rceil} = \{R_{odd}(\mathbf{R})\}$. Recovering a solution from $\mathcal{R}_{\lceil n/2 \rceil}$ is straightforward.

Besides, we are actually interested in the optimal bounding function modulo $\mathrm{GH}(\mathcal{L})$, which describes the density of the given lattice. Hereafter we therefore use $\mathbf{r} = R_{odd}(\mathbf{R}) / \mathrm{GH}(\mathcal{L})$ to denote the bounding function modulo volume, and we search for the optimal solution in the corresponding normalized space. In our computation of cost and success probability, we likewise use $\mathbf{r}$, normalized by $\mathrm{GH}(\mathcal{L})$, instead of $\mathbf{R}$.

3.4.1 Optimization problem 

The idea of enumeration with extreme pruning by Gama et al. [29] consists of taking a relatively small $p_{succ}(r, \mathbf{R})$ for each enumeration, but randomizing the basis and running the enumeration many times. After about $1/p_{succ}(r, \mathbf{R})$ enumerations, one expects to find a solution vector within radius $r$. The problem of searching for the best bounding function can be formulated as follows.

Given a lattice $\mathcal{L}(B)$, suppose that randomizing and preprocessing the basis to reduce it to a quality similar to that of $B$ takes time $T_0$ on average. The expected running time of one enumeration is $T_{enum}(\mathbf{R}, B) = t_{node} \cdot N(\mathbf{R}, B)$, where $N(\mathbf{R}, B)$ is the number of nodes enumerated with bounding function $\mathbf{R}$. Here $t_{node}$ is the average time for visiting a single node, which is approximately $10^{-7}$ seconds in a double-precision C++ implementation. The total expected time for finding a solution is

\[ T = \frac{T_0 + T_{enum}(\mathbf{R}, B)}{p_{succ}(r, \mathbf{R})}. \tag{3.34} \]
An optimal bounding function for enumeration with extreme pruning is a bounding function which minimizes the total running time $T$. That is, the optimal bounding function is the solution to the following optimization problem:

\[ \begin{aligned} \underset{\mathbf{R}}{\text{minimize}} \quad & \frac{T_{enum}(\mathbf{R}, B) + T_0}{p_{succ}(r, \mathbf{R})} \\ \text{subject to} \quad & R_{odd}(\mathbf{R}) \in \mathcal{R}_{\lceil n/2 \rceil} \end{aligned} \tag{3.35} \]

There are numerical ways to compute solutions to this problem, which are presented in the following subsections.

The optimization problem is defined by three parameters: $(T_0, B, r)$. We thus denote its solution by $\mathbf{R}(T_0, B, r)$. Figure 3.8 shows some example bounding functions obtained as numerical solutions to optimization problems. Here $n = 70$, and the solutions are in the space $\mathcal{R}_{\lceil n/2 \rceil}$.



Figure 3.8: The numerical solutions $\mathbf{R}(T_0, B, r)$ with a fixed $B$ of dimension $n = 70$, $r = 1.05$, and different $T_0$.


3.4.2 Numerical solutions 

Optimization over continuous variables is a classical topic. One possible method is random search, using the cross-entropy method for example. An advantage of random search is that it finds the global optimum with good probability. However, it suffers from the disadvantage of being very slow. Another classical way to search for an optimal solution is to use Newton-type methods, such as the sub-gradient method or the method of steepest descent. In contrast with random search, these methods are generally much more efficient, and have good running time bounds. However, they have two main disadvantages. First, they require the computation of first or second derivatives of the target function. This might make the computation numerically unstable, because we can only compute the target function numerically and then approximate the derivatives by finite differences. Second, the method may output a local optimum that is not the global optimum. A similar method is golden section search. With this method, we are not required to compute first or second derivatives, but we still risk falling into a local optimum.

In our experiments we tried two different methods: random search using the cross-entropy method, and golden section search.

Cross-entropy method 

The cross-entropy method is attributed to Reuven Rubinstein [84] and was originally applied in the domain of machine learning. Among random search methods, one important trait of this method is that it chooses random steps in an intelligent way, adapting them to the function values, which makes it more efficient.

To do so, it not only stores the current optimum $\mathbf{R}$, but also maintains a list $L$ of the $N$ best $\mathbf{R}$'s found so far. It learns the variance of the elements in the list $L$, and this variance is then used to generate new perturbations. We then evaluate the perturbed $\mathbf{R}'$, which either replaces $\mathbf{R}$, or is added to the list $L$, or is discarded. The algorithm iterates this procedure until the distribution of good perturbed $\mathbf{R}'$ in $L$ has too little variance, which probably indicates that we are numerically very close to the optimum.

This method is described in pseudo-code in Algorithm 14.


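Algorithm 14 Cross-entropy method for searching $\mathbf{R}(T_0, B, r)$

Input: $B = (\|b_1^*\|^2, \dots, \|b_n^*\|^2)$, the Gram-Schmidt squared norms of a lattice basis;
$T_0$, the pre-processing and randomization time;
$r$, the upper bound on the radius of the target vector.
Output: A bounding function $\mathbf{R} = (R_1, \dots, R_{\lceil n/2 \rceil})$ which minimizes $\frac{T_{enum}(\mathbf{R}, B) + T_0}{p_{succ}(r, \mathbf{R})}$
1: $\mathbf{R} \leftarrow \mathbf{R}_0$ // Initialize to any $\mathbf{R}_0$, for example, $\mathbf{R} \leftarrow (\dots)$
2: $L \leftarrow \emptyset$
3: $\Sigma \leftarrow 0.1 \cdot (0.1, 0.1, \dots, 0.1)_{\lceil n/2 \rceil}$ // Initial variance
4: while $\|\Sigma\| > \epsilon$ do
5:     $\Delta\mathbf{R} \leftarrow \mathrm{GenerateRandom}(\Sigma)$
6:     $\mathbf{R}' \leftarrow \mathbf{R} + \Delta\mathbf{R}$
7:     if $T_{total}(\mathbf{R}') < T_{total}$($L$'s last element) then
8:         Add $\mathbf{R}'$ to $L$.
9:         if $T_{total}(\mathbf{R}') < T_{total}(\mathbf{R})$ then
10:             $\mathbf{R} \leftarrow \mathbf{R}'$
11:         end if
12:     end if
13:     update $\Sigma$ according to $L$.
14: end while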


Some details of the algorithm:

• $\Sigma$ is initialized to a relatively large value, to allow a wide exploration range for the random search. In our case, it is set to $0.1 \cdot (0.1, 0.1, \dots, 0.1)_{\lceil n/2 \rceil}$. At each update, instead of taking $\Sigma$ to be the standard deviation of the elements in $L$, it is better to add a term $\sigma_t$. This $\sigma_t$ varies with the iteration number $t$: in the first iterations $\sigma_t$ is relatively large in order to encourage exploration, and it is gradually decreased to ensure convergence of the search.

• For efficiency in practice, $\Sigma$ need not be updated at each change of $L$; it can be updated every $N/10$ updates of $L$, for example. In particular, the first update of $\Sigma$ should take place after the list $L$ has been filled with $N$ elements.


Golden section search 


As opposed to the cross-entropy method, golden section search optimizes $\mathbf{R}$ index by index. This method is guaranteed to find the optimum of a unimodal function. A function $f(\cdot)$ is unimodal on the interval $[l, u]$ if it satisfies the following property: if $l < x_0 < u$ is its minimal point, then $f(\cdot)$ is decreasing over $[l, x_0]$ and increasing over $[x_0, u]$. In general, if the target function is not unimodal, the method may return only a local optimum.

The description in pseudo-code is given in Algorithms 15 and 16. Algorithm 16 is the usual golden section search for single-variable target functions. Algorithm 15 calls Algorithm 16 for each index of $\mathbf{R}$ iteratively, until no update happens.


Algorithm 15 Find optimized bounding functions for enumeration $R(T_0, B, r)$

Input: $B = (\|b_1^*\|^2, \ldots, \|b_n^*\|^2)$, the Gram-Schmidt squared norms of a lattice;
  $T_0$ the pre-processing and randomization time,
  $r$ the upper bound of radius of the target vector.
Output: A bounding function $R_{odd}(R) = (R_1, \ldots, R_{\lceil n/2 \rceil})$ which minimizes $T_{total}$
1: $R_{odd}(R) \leftarrow R_0$ // Initialize to any $R_0$, for example, $R \leftarrow (\ldots)$.
2: update $\leftarrow$ true
3: while update = true do
4:   update $\leftarrow$ false
5:   for $i = 1, \ldots, \lceil n/2 \rceil$ do
6:     $\rho \leftarrow \arg\min_{\rho \in [R_{i-1}, R_{i+1}]} T_{total}((R_1, \ldots, R_{i-1}, \rho, R_{i+1}, \ldots, R_{\lceil n/2 \rceil}), B, T_0)$
       // Using GoldenSectionSearch (Alg. 16) to find the optimal point of a single-variable function
7:     if $|\rho - R_i| > \epsilon$ then
8:       $R_i \leftarrow \rho$
9:       update $\leftarrow$ true
10:    end if
11:  end for
12: end while
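As a rough illustration of the coordinate-wise scheme above, the following Python sketch minimizes a cost function one index at a time and stops when a full sweep changes no coordinate by more than a tolerance. The function `toy_cost`, the boundary conventions (lower bound 0 for the first index, upper bound `r` for the last one) and the helper `golden_min` are assumptions made for this example only; the real $T_{total}$ is not reproduced here.

```python
import math

def golden_min(f, lo, hi, eps=1e-7):
    """Minimal 1-D golden section minimizer (stand-in for Algorithm 16)."""
    g = (math.sqrt(5) - 1) / 2
    a, b = hi - g * (hi - lo), lo + g * (hi - lo)
    fa, fb = f(a), f(b)
    while hi - lo > eps:
        if fa > fb:                      # minimum lies in [a, hi]
            lo, a, fa = a, b, fb
            b = lo + g * (hi - lo)
            fb = f(b)
        else:                            # minimum lies in [lo, b]
            hi, b, fb = b, a, fa
            a = hi - g * (hi - lo)
            fa = f(a)
    return (lo + hi) / 2

def optimize_bounding(cost, R, r, eps=1e-5, max_sweeps=1000):
    """Coordinate-wise minimization of cost(R), keeping R non-decreasing in [0, r]."""
    R = list(R)
    for _ in range(max_sweeps):
        updated = False
        for i in range(len(R)):
            lo = R[i - 1] if i > 0 else 0.0          # assumed lower boundary
            hi = R[i + 1] if i + 1 < len(R) else r   # assumed upper boundary
            def f(rho, i=i):
                return cost(R[:i] + [rho] + R[i + 1:])
            rho = golden_min(f, lo, hi)
            if abs(rho - R[i]) > eps:
                R[i] = rho
                updated = True
        if not updated:
            break
    return R

# Hypothetical convex cost with a known minimizer, used only to exercise the loop.
targets = [0.2, 0.5, 0.9]
def toy_cost(R):
    return sum((x - t) ** 2 for x, t in zip(R, targets))

result = optimize_bounding(toy_cost, [0.1, 0.2, 0.3], r=1.0)
```

Restricting each coordinate to the interval between its neighbours keeps the bounding function monotone throughout the search, at the price of possibly needing several sweeps before a coordinate can reach its optimum.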


The following is a description of golden section search for optimizing a single-variable function. In a nutshell, the idea is to shrink the interval of interest $[l, u]$ until its length is smaller than $\epsilon$. In each iteration, we compare the values $f(t)$ and $f(t')$, where $t < t'$ are the two symmetric golden section points of the interval $[l, u]$. If $f(t) > f(t')$ then we shrink the interval, $[l, u] \leftarrow [t, u]$; otherwise we do the symmetric operation, $[l, u] \leftarrow [l, t']$. [77]

Figure 3.9: Illustration of the golden section search algorithm. At each iteration, the initial interval is $[l, u]$. Depending on the comparison between $f(t)$ and $f(t')$ (case (a) or case (b)), the interval of interest shrinks by a factor of 0.618.
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Algorithm 16 GoldenSectionSearch 

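Input: $f(\cdot)$ the target unimodal function to be optimized.
  $[l, u]$ the interval of interest
Output: $t$ such that $f(t)$ minimizes $f(\cdot)$ over $[l, u]$
1: $\varphi_1 \leftarrow (3 - \sqrt{5})/2$, $\varphi_2 \leftarrow (\sqrt{5} - 1)/2$
2: $t \leftarrow \varphi_1 u + \varphi_2 l$
3: while $u - l > \epsilon$ do
4:   if $u - t > t - l$ then
5:     $t' \leftarrow \varphi_1 u + \varphi_2 t$
6:   else
7:     $t' \leftarrow t$
8:     $t \leftarrow \varphi_1 l + \varphi_2 t'$
9:   end if
10:  if $f(t) > f(t')$ then
11:    $l \leftarrow t$
12:    $t \leftarrow t'$
13:  else
14:    $u \leftarrow t'$
15:  end if
16: end while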
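The pseudo code above translates almost line by line into Python. The sketch below assumes the usual golden section constants $\varphi_1 = (3-\sqrt{5})/2 \approx 0.382$ and $\varphi_2 = (\sqrt{5}-1)/2 \approx 0.618$, and it caches function values so that each iteration evaluates $f$ only once; both choices are assumptions of this example rather than details taken from the text.

```python
import math

PHI1 = (3 - math.sqrt(5)) / 2   # about 0.382
PHI2 = (math.sqrt(5) - 1) / 2   # about 0.618, PHI1 + PHI2 == 1

def golden_section_search(f, l, u, eps=1e-6):
    """Return a point close to the minimizer of a unimodal f on [l, u]."""
    t = PHI1 * u + PHI2 * l
    ft = f(t)
    while u - l > eps:
        if u - t > t - l:
            # t is the lower golden point: probe the upper one
            t2 = PHI1 * u + PHI2 * t
            ft2 = f(t2)
        else:
            # t is the upper golden point: reuse it and probe the lower one
            t2, ft2 = t, ft
            t = PHI1 * l + PHI2 * t2
            ft = f(t)
        if ft > ft2:
            l, t, ft = t, t2, ft2   # minimum lies in [t, u]
        else:
            u = t2                  # minimum lies in [l, t2]
    return (l + u) / 2

x_min = golden_section_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each pass keeps one of the two probe points, which is why the interval shrinks by the constant factor 0.618 at the cost of a single new function evaluation.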


Golden section search enabled us to solve the optimization problem faster than the cross-entropy method. However, it is still quite slow: finding a good bounding function usually takes several hours on a machine of about 2 GHz, which in some cases is comparable to the time needed for the enumeration itself. This is not satisfactory, because we need to compute a proper bounding function each time we search for a short vector in a new basis.

To overcome this problem, we prepared a large number of bounding functions $R(T_0, B, r)$ for different dimensions $n$ and different $p_{succ}$. For each enumeration, we can choose the bounding function whose parameters are closest to our need.

However, when we observed these numerical solutions of the optimization problem, we discovered several interesting properties, which help us to solve it more efficiently.

Uniqueness of Solution We used several different optimization methods to find $R$, such as the cross-entropy method and golden section search. Regardless of the initial value of $R$ and of the optimization method adopted, the result converges to the same optimal bounding function. This strongly suggests that the problem has a single global optimum, and that it is very likely a convex optimization problem, although we cannot yet prove the convexity of the target function. In practice, we can apply existing convex optimization methods, such as golden section search, to this problem. We did not use Newton-type methods, in order to avoid numerical stability issues.

Parameterization For various parameters $(T_0, B, r)$, the bounding functions $R(T_0, B, r)$ have very similar shapes, as shown in figure 3.8. These bounding functions dwell in a subspace of $\mathbb{R}^{\lceil n/2 \rceil}$, which can be analyzed by principal component analysis (PCA). We would hope to find the following relationship for $R(T_0, B, r)$:

$$R(T_0, B, r) = R_0 + T_0 \cdot R_1 + r \cdot R_2 \qquad (3.36)$$

so that every $R(T_0, B, r)$ can be computed as a linear combination of some known vectors. But this model is too simple to approximate the real case. In fact, the factors $T_0$, $r$, $B$ do not necessarily enter linearly, and they are not necessarily independent of one another. A more comprehensive model is

$$R(T_0, B, r) = R_0 + f_1(T_0) \cdot R_1 + f_2(r) \cdot R_2 + f_3(B) \cdot R_3 + f_4(T_0, r) \cdot R_4 + \ldots \qquad (3.37)$$

where the $f_i(\cdot)$ are functions that we have to learn from the known bounding functions.

Our goal is to reproduce or approximate $R(T_0, B, r)$ using as few terms as possible. A good approximation will serve as a good initial value for the optimization algorithm, and ultimately we hope to reproduce $R$ with high precision, so that we can generate it immediately when we need it for enumeration.

Heuristic Cost The cost of the optimal bounding function satisfies $T_{enum}(R(T_0, B, r)) \sim T_0$. This heuristic result is not directly visible from the problem statement in (3.35). However, in practice, $T_{enum}(R, B)$ usually increases much faster than $1/p_{succ}(R)$, and it is often the case that $T_{enum}(R(T_0, B, r)) \sim T_0$. This heuristic is useful later, when we want to compute an $R$ which makes the enumeration run in some given time, for example during BKZ.


3.5 Optimize and parameterize bounding functions 

The optimization problem of equation 3.35 finds a bounding function for enumeration with extreme pruning. This enumeration routine is used as a stand-alone algorithm to solve HSVP. However, we also want to apply pruning to enumeration used as a subroutine of BKZ reduction. This problem is somewhat different from the previous one.

BKZ requires a subroutine to find a shortest vector in a local projected lattice $\mathcal{L}(B_{[j,k]})$: given as input two integers $j$ and $k$ such that $1 \le j \le k \le n$, output $v = (v_j, \ldots, v_k) \in \mathbb{Z}^{k-j+1}$ such that $\|\pi_j(\sum_{i=j}^{k} v_i b_i)\| = \lambda_1(\mathcal{L}(B_{[j,k]}))$. In practice, as well as in the BKZ article [87], this is implemented by an enumeration which is a variation of Alg. 8.

Similar analysis and pruning techniques can be applied to this enumeration procedure. In particular, when the block size is large enough ($\beta > 50$), the blocks in BKZ behave very similarly to random bases. However, there are two major differences between the enumeration sub-procedure in BKZ and a stand-alone procedure: the constraints and the optimization goal are different.


1. Instead of requiring the enumeration to find a short vector in each block, we can tolerate more failures, where certain blocks are left unchanged. In this case the optimization goal is much more complicated, because it is not evident how a single enumeration affects the efficiency of the BKZ reduction as a whole.

2. BKZ makes many calls to the enumeration routine, and each enumeration usually has a smaller blocksize than a stand-alone enumeration. This means that we want to spend even less time computing proper bounding functions for the enumeration routine.


Considering these differences, we need bounding functions that are solutions to different optimization problems, defined as follows.

• One possible attempt is to compute the optimal $R$ for a fixed $r$ and $p_{succ}(r, R) = p$. In this way the output quality of the enumeration can be anticipated.

$$\begin{aligned} &\underset{R_{odd}(R) \in \mathbb{R}^{\lceil n/2 \rceil}}{\text{minimize}} && T_{enum}(R, B) \\ &\text{subject to} && p_{succ}(r, R) = p \end{aligned} \qquad (3.38)$$

Figure 3.10 shows examples of bounding functions in dimension 65 which are solutions to this optimization problem with $r = 1$ and various $p$; on the right is their enumeration cost for each level of the enumeration tree.



Figure 3.10: $R(r, B, p) \in \mathbb{R}^{\lceil n/2 \rceil}$ with $n = 80$, $r = 1.03$, and different $p$.

Figure 3.11: The number of nodes ($N_i$) on a logarithmic scale for $R(r, B, p)$ with $r = 1.03$ and different $p$. For comparison, we also plot the curve for naive enumeration with $r = 1$, which is considerably higher than any of the pruned enumerations.


• Another choice would be to compute the optimal $R$ for a fixed $T_{enum}(R, B) = T$ and $r$. In this way, the optimization problem becomes

$$\begin{aligned} &\underset{R_{odd}(R) \in \mathbb{R}^{\lceil n/2 \rceil}}{\text{maximize}} && p_{succ}(r, R) \\ &\text{subject to} && T_{enum}(R, B) = T \end{aligned} \qquad (3.39)$$

For both problems, we have to predefine $p$ and $r$, or $T$ and $r$, and it is not clear which values lead to the best efficiency of BKZ.

Similar to the optimization problem in (3.35), we name the solution to these problems R(r, B, p) and 
R(r, B, T) respectively. 

3.5.1 Optimization problem with constraints 

The constrained optimization problems in equations (3.38) and (3.39) are difficult to solve in general, because we have to search in the subspace defined implicitly by the equation $T_{enum}(R, B) = T$ or $p_{succ}(r, R) = p$ respectively. We can use random search with a relaxed constraint, but the search then becomes even slower and usually ends with a sub-optimal solution $R$.

However, it is possible to find solutions to these problems by making several calls to the unconstrained optimization problem (3.35), by observing the following relationships between the unconstrained optimization problem (equation 3.35), the optimization problem with constraints on $r$ and $p$ (equation 3.38), and the optimization problem with constraints on $r$ and $T$ (equation 3.39).

• If $R(T_0, B, r)$ is a solution to (3.35), then it is also the solution to (3.38) with $r$, $p = p_{succ}(R(T_0, B, r))$ and $B$, and the solution to (3.39) with $r$, $T = T_{enum}(R(T_0, B, r), B)$ and $B$. In other words, if we have an oracle for the solutions of the optimization problem (3.35), then we also know the solutions to problems (3.38) and (3.39) for certain $T$ or $p$. By making several requests for solutions to problem (3.35), we hope that $T$ or $p$ will gradually approach the intended value.

• Moreover, as mentioned before, since $T_0$ is a heuristic approximation of $T_{enum}(R(T_0, B, r))$, it is very likely that $R(T_0, B, r)$ is very close to the solution of problem (3.39) with $T = T_0$. This gives a good initial value for the search.

Algorithm 17 solves problem (3.39) given an oracle for problem (3.35). It works most of the time in practice, though termination is not guaranteed in theory and the iteration may diverge in certain cases.


Algorithm 17 Optimization for $R(r, B, T)$

Input: $(B, T, r)$
Output: $R$ with $T_{enum}(R, B) = T$ which maximizes $p_{succ}(r, R)$.
1: $T' \leftarrow T$
2: $R \leftarrow R(T', B, r)$
3: $\delta \leftarrow T_{enum}(R, B) - T$
4: while $|\delta| > \epsilon$ do
5:   $T' \leftarrow T' - \delta$
6:   $R \leftarrow R(T', B, r)$
7:   $\delta \leftarrow T_{enum}(R, B) - T$
8: end while
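The iteration can be sketched in Python as a simple correction loop around an oracle. Here `oracle` stands for a solver of problem (3.35) and `enum_cost` for $T_{enum}(\cdot, B)$; both are hypothetical callables supplied by the caller. The sketch assumes that the correction is applied to the current $T'$ and that the loop stops on $|\delta| \le \epsilon$, with an iteration cap added because termination is not guaranteed.

```python
def solve_for_target_time(oracle, enum_cost, T, eps=1e-6, max_iter=100):
    """Adjust T' until enum_cost(oracle(T')) is within eps of the target T.

    oracle(T_prime) -> R    : hypothetical solver for the unconstrained problem
    enum_cost(R)    -> float: hypothetical enumeration cost of R
    """
    T_prime = T
    R = oracle(T_prime)
    delta = enum_cost(R) - T
    for _ in range(max_iter):          # cap added: the iteration may diverge
        if abs(delta) <= eps:
            break
        T_prime -= delta               # shift the oracle parameter by the error
        R = oracle(T_prime)
        delta = enum_cost(R) - T
    return R, T_prime

# Toy stand-ins: the "bounding function" is a scalar and the cost is affine in it.
toy_oracle = lambda t: t
toy_cost = lambda R: 1.2 * R + 0.1
R_found, T_used = solve_for_target_time(toy_oracle, toy_cost, T=1.0)
```

In this toy setting the error shrinks by a factor of 0.2 per step; with a cost that reacts much more strongly to $T'$ the same update can overshoot, which matches the remark that divergence is possible.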


3.5.2 Principal component analysis 

In this subsection we introduce principal component analysis and apply it to the solutions of the optimization problems. In fact, it is hard not to notice that these solutions have very similar shapes. Although they all belong to $\mathbb{R}^{\lceil n/2 \rceil}$, we believe that they dwell in some subspace of $\mathbb{R}^{\lceil n/2 \rceil}$ of much smaller dimension than $\lceil n/2 \rceil$.

We will first find the dimension $d$ of this subspace, then try to build a model with $d$ terms so that $R$ can be expressed as a linear combination of $d$ known vectors.

Principal component analysis is a statistical tool commonly used in image processing for finding patterns in high-dimensional data. Depending on the field of application, it also goes by many other names. We borrow this concept here because we also want to identify patterns in data, and to highlight their differences and similarities. It is a non-parametric method, which can be viewed as an advantage, since we do not have to provide any additional training information, or as a disadvantage, because it is impossible to incorporate any assumption drawn from our observations into the model. Therefore, PCA is only used here as a first step, to learn some important features of the data, such as the number of its main feature vectors. Later we build our model with the least squares method.

PCA is defined as an orthogonal linear transformation which maps the data $X = (x_1, \ldots, x_n)$ to a new coordinate system. In this new coordinate system, the first coordinate $w_1$ (called the first principal component) represents the direction in which $X$ has the greatest variance, and $w_i$ is the direction in which $X$ has the $i$-th greatest variance. This process is illustrated in figure 3.12, which gives an example in dimension 2.



Figure 3.12: Illustration of PCA. Data points have the greatest variation in direction $w_1$, and the second greatest in direction $w_2$.


More precisely, let $C$ be the covariance matrix of the data $X$ of dimension $m$,

$$C = cov(X)$$

and let $W = (w_1, w_2, \ldots, w_m)$ be its eigenvectors listed in decreasing order of the corresponding eigenvalues. Then $w_1$ corresponds to the first principal component, $w_i$ corresponds to the $i$-th principal component, and so on.

The next step consists of selecting the important principal components and leaving out the unimportant ones. This is how PCA compresses the dimension. Generally, the order of the eigenvalues gives the order of significance of the corresponding eigenvectors, and one picks the $d$ vectors ($d < m$) which correspond to the $d$ largest eigenvalues. All data points can then be approximated in this compressed $d$-dimensional space:

$$x_i \approx x_i' = \bar{x} + c_1 w_1 + c_2 w_2 + \cdots + c_d w_d, \qquad (3.40)$$

where $\bar{x}$ is the mean over the data set $X$, and $c_j = \langle x_i - \bar{x}, w_j \rangle$ is the length of the projection of $x_i - \bar{x}$ on direction $w_j$.

The difference between $x_i$ and $x_i'$ is controlled in a statistical sense:

$$x_i - x_i' = c_{d+1} w_{d+1} + c_{d+2} w_{d+2} + \cdots + c_m w_m, \qquad (3.41)$$

where $c_{d+1}, \ldots, c_m$ are expected to be small, since $cov(X)$ has small eigenvalues for the eigenvectors $w_{d+1}, \ldots, w_m$.
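The procedure of this subsection can be sketched with NumPy as follows: compute the covariance matrix, sort its eigenvectors by decreasing eigenvalue, keep the first $d$ of them, and reconstruct a data point from its projections. The function names `pca` and `reconstruct` are chosen for this illustration; the data layout (one sample per row) is an assumption of the example.

```python
import numpy as np

def pca(X, d):
    """X has shape (num_samples, m). Returns the mean, the top-d principal
    components as rows of a (d, m) array, and all eigenvalues in decreasing order."""
    mean = X.mean(axis=0)
    C = np.cov(X, rowvar=False)               # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # re-order to decreasing
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return mean, eigvecs[:, :d].T, eigvals

def reconstruct(x, mean, W):
    """Approximate x by mean + sum_j c_j w_j with c_j = <x - mean, w_j>."""
    c = W @ (x - mean)
    return mean + c @ W
```

Inspecting the returned eigenvalues is how the number $d$ of significant components is chosen: components whose eigenvalues are negligible compared with the leading ones can be dropped with little reconstruction error.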


3.5.3 Data acquisition and analysis 

In order to apply PCA to our problem, we need a large number of solution bounding functions. These are obtained by solving many instances of the optimization problem numerically using golden section search. Among the 3 parameters which determine $R(T_0, B, r)$, we start by fixing two of them and varying $T_0$.

We set $r = 1.05$ and $B = B_{70}$, where $B_{70}$ is a typical BKZ-30 reduced basis of dimension 70, and let $T_0$ range over $\{30, 10, 3, 1, \ldots, 0.003, 0.001\}$. The data is illustrated in figure 3.8. Recall that $R$ is defined here as $R_{odd}(R_1, \ldots, R_n)$; we actually perform PCA and the least squares method over $\sqrt{R(T_0, B, r)}$. We call this data set $\mathcal{R}_{B_{70},1.05}$ for short.

We compute the covariance matrix of the data set, and let $R_1, R_2, \ldots$ be the principal components listed in order of importance. In addition, we denote by $R_0$ the mean value over the data set.

The 5 largest eigenvalues of the covariance matrix of $\mathcal{R}_{B_{70},1.05}$ are

$$0.36, \quad 1 \times 10^{-4}, \quad 8 \times 10^{-8}, \quad 7 \times 10^{-9}, \quad 1 \times 10^{-9}.$$

After the second value, the remaining eigenvalues are almost negligible. Therefore we only select $R_1$ and $R_2$ as principal components.

Figure 3.13 shows the principal components $R_0$, $R_1$, $R_2$ for $\mathcal{R}_{B_{70},1.05}$.

Figure 3.13: Principal components $R_0$, $R_1$, $R_2$ for $\mathcal{R}_{B_{70},1.05}$.


3.5.4 Parameterize bounding functions 

PCA managed to extract the most important feature vectors from the data. But it does not provide an interpretation of the extracted data, nor does it relate the coefficients $c_i$ to the parameter $T_0$ of the solution. We compare the coefficients $c_1$ and $c_2$ of $\mathcal{R}_{B_{70},1.05}$ with their corresponding $\ln(T_0)$ in figure 3.14 and figure 3.15.



Figure 3.14: $c_1$ as a function of $\ln(T)$, for the principal components $R_0$, $R_1$, $R_2$ of $\mathcal{R}_{B_{70},1.05}$.

Figure 3.15: $c_2$ as a function of $\ln(T)$, for the principal components $R_0$, $R_1$, $R_2$ of $\mathcal{R}_{B_{70},1.05}$.


Assume $c_1$ and $c_2$ to be functions of $\ln(T_0)$. The observation then indicates that $c_1$ and $c_2$ are at most quadratic in $\ln(T_0)$, i.e. we expect that vectors in $\mathcal{R}_{B_{70},1.05}$ can be expressed in the following form

$$R = R_0 + c_1(\ln(T_0)) \cdot R_1 + c_2(\ln(T_0)) \cdot R_2, \qquad (3.42)$$

where $c_1(\cdot)$ and $c_2(\cdot)$ are polynomials of degree at most 2.

By collecting the constant, linear and quadratic terms in $\ln(T_0)$, we can rewrite equation 3.42 as

$$R = R_0' + \ln(T_0) \cdot R_1' + (\ln(T_0))^2 \cdot R_2', \qquad (3.43)$$

where $R_0'$, $R_1'$ and $R_2'$ are linear combinations of $R_0$, $R_1$ and $R_2$.

Notice that in equation 3.43, $R_0'$, $R_1'$ and $R_2'$ can be computed by the least squares method. In fact, equation 3.43 amounts to the following.


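$$\begin{pmatrix} R(T_1) \\ R(T_2) \\ \vdots \\ R(T_n) \end{pmatrix} = \begin{pmatrix} 1 & \ln(T_1) & \ln(T_1)^2 \\ 1 & \ln(T_2) & \ln(T_2)^2 \\ \vdots & \vdots & \vdots \\ 1 & \ln(T_n) & \ln(T_n)^2 \end{pmatrix} \begin{pmatrix} R_0' \\ R_1' \\ R_2' \end{pmatrix} \qquad (3.44)$$

Therefore we have

$$\begin{pmatrix} R_0' \\ R_1' \\ R_2' \end{pmatrix} = \begin{pmatrix} 1 & \ln(T_1) & \ln(T_1)^2 \\ 1 & \ln(T_2) & \ln(T_2)^2 \\ \vdots & \vdots & \vdots \\ 1 & \ln(T_n) & \ln(T_n)^2 \end{pmatrix}^{\dagger} \begin{pmatrix} R(T_1) \\ R(T_2) \\ \vdots \\ R(T_n) \end{pmatrix} \qquad (3.45)$$

where $A^{\dagger}$ is the generalized left inverse, which is defined as $(A^t \cdot A)^{-1} A^t$.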
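A minimal NumPy sketch of this fit is given below. It builds the matrix with columns $1$, $\ln(T)$, $\ln(T)^2$ and solves the least squares problem with `numpy.linalg.lstsq`, which gives the same result as the generalized left inverse when the matrix has full column rank. The function names and the sample values used to exercise them are assumptions of this example, not data from the experiments.

```python
import numpy as np

def fit_log_quadratic(T_values, R_values):
    """Fit R(T) ~ R0' + ln(T) * R1' + ln(T)^2 * R2'.

    T_values: sequence of positive parameters T_1, ..., T_n
    R_values: array of shape (n, dim), one bounding function per row
    Returns an array of shape (3, dim) with rows R0', R1', R2'.
    """
    x = np.log(np.asarray(T_values, dtype=float))
    A = np.column_stack([np.ones_like(x), x, x ** 2])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(R_values, dtype=float), rcond=None)
    return coeffs

def evaluate(coeffs, T):
    """Reconstruct the bounding function for a parameter T from the fitted rows."""
    x = np.log(T)
    return coeffs[0] + x * coeffs[1] + x ** 2 * coeffs[2]
```

Using a library least squares solver instead of forming $(A^t A)^{-1} A^t$ explicitly is a common choice because it avoids squaring the condition number of the matrix.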


3.5.5 Adding variable of r 


We already know how to represent $\mathcal{R}_{B_{70},1.05}$, and in this section we take the variable $r$ into consideration. As before, we first determine the number of principal components by PCA and then solve the model by the least squares method.

We still fix $B = B_{70}$, and let $T_0$ range over $\{30, 10, 3, 1, \ldots, 0.003, 0.001\}$ and $r$ over $\{1, 1.01, 1.02, \ldots, 1.1\}$. Figure 3.16 shows some examples of $R$ for $T_0 = 1$ and different $r$.

Figure 3.16: Examples of $R_{odd}(R(T_0, B_{70}, r))$ with $T_0 = 1$ and different $r$.


For all solutions $R(r)$, we have $R_n = r$. Therefore at least one component of $R$ is linear in $r$. PCA shows that this data set has 3 significant components. Thus we add a component that is linear in $r$ to the extended model. As a result, we have the following model

$$R = R_0 + r \cdot R_1 + \ln(T_0) \cdot R_2 + \ln(T_0)^2 \cdot R_3. \qquad (3.46)$$

The parameter $r$ is usually close to 1, therefore we substitute the variable $r$ with $(r - 1)$:

$$R = R_0 + (r - 1) \cdot R_1 + \ln(T_0) \cdot R_2 + \ln(T_0)^2 \cdot R_3. \qquad (3.47)$$

Then $R_0$, $R_1$, $R_2$ and $R_3$ are computed by the least squares method. Their respective shapes are illustrated in figure 3.17. Notice that the $R_3$ curve is not as smooth as the other vectors. This may be because its small scale makes it especially vulnerable to quantization error.

Figure 3.17: $R_0$, $R_1$, $R_2$ and $R_3$ for linearly reconstructing bounding functions in $\mathcal{R}_{B_{70}}$.


Finally, we are able to reconstruct $R(T_0, B_{70}, r)$ with this model. In theory, $T_0$ and $r$ can take any meaningful value, but stability is only guaranteed for values in the intervals $T_0 \in [0.001, 30]$ and $r \in [1, 1.1]$.

Figure 3.18 compares a reconstructed bounding function $R(T_0, B_{70}, r)$ with a bounding function obtained by numerically solving the optimization problem. Their difference is very small and hardly visible.

3.5.6 Consider different basis 

Our discussion above has assumed a fixed basis $B_{70}$, which is a typical BKZ-40 reduced basis of dimension 70. In reality, we may need to perform enumeration on bases of different quality, and for different input bases the same bounding function may give different enumeration times. However, we would like to make the following remarks:

• The bases may have different quality, which results in different enumeration costs. However, for these different bases, one bounding function provides a similar proportion of speed-up. This is because the enumeration tree has the largest number of nodes around level $\lceil n/2 \rceil$, and a given bounding function always prunes $N_{\lceil n/2 \rceil}$ by a fixed proportion.

• It is well known that the enumeration of a basis of bad quality can be made more efficient by performing a reasonable reduction prior to enumeration. This is discussed more rigorously in section 5.1, in the discussion of the optimal reduction sequence. The optimal strategy of reduction and enumeration will always ensure that the enumeration, with two of the three parameters $r$, $p_{succ}$, $T$ given, accepts only a basis whose quality is not too bad; otherwise it would be better to reduce the basis a little more before enumeration.

Figure 3.18: Comparison between the optimization solution (solid line) and the linear reconstruction (dashed line) of $R(T_0 = 30, B_{70}, r = 1)$.

Consequently, we change variables in the previous parameterization once more. Because it is the final enumeration time $T$ that interests us, not the "preprocessing" time $T_0$, we are motivated to replace $\ln(T_0)$ by a descriptor of the logarithm of the proportion pruned by the bounding function. This descriptor is defined as:

$$M = \log_2 \frac{N^{(full)}}{\max_{1 \le i \le n} N_i^{(R)}}$$

In this way, the pruned enumeration with $R$ should be approximately $2^M$ times faster than the naive full enumeration with $r = 1$.

The model is changed accordingly:

$$R = R_0 + (r - 1) \cdot R_1 + (M - n/4) \cdot R_2 + (M - n/4)^2 \cdot R_3, \qquad (3.48)$$

where we use the term $M - n/4$ instead of the variable $M$ because frequent choices of $M$ are close to $n/4$. Again $R_0$, $R_1$, $R_2$ and $R_3$ are computed by the least squares method. Their respective shapes are illustrated in figure ??.


Figure 3.19 (panels $R_1$, $R_2$): Comparison between the optimization solution (solid line) and the linear reconstruction (dashed line) of $R(T_0 = 30, B_{70}, r = 1)$.


3.5.7 Extend model for different dimensions 


The discussion in the previous subsections applies to $R$ with a fixed dimension 70. The same analysis is also applied to construct models for dimensions $n = 50, 60, 70, 80, 90, 110$. Using these models, we can conveniently compute an optimal bounding function in these dimensions.

For intermediate dimensions, we interpolate between the models of the two neighboring dimensions for which a model is known. For example, suppose we want to compute a bounding function in dimension $k$, where $50 < k < 110$ is not a multiple of 10, and we have the models for dimensions $k_1 = 10 \cdot \lfloor k/10 \rfloor$ and $k_2 = 10 \cdot \lceil k/10 \rceil$. We interpolate these two models to create a model for dimension $k$ in the following way:

$$R_i^{(k)} = \frac{k_1 - k}{k_1 - k_2}\, int(R_i^{(k_2)}, k) + \frac{k - k_2}{k_1 - k_2}\, int(R_i^{(k_1)}, k) \qquad (3.49)$$

for $i = 1, 2, 3, 4$, where the function $int((R_1, \ldots, R_d), k)$ is defined as a vector of length $k$ whose $i$-th coordinate is

$$(\lfloor j \rfloor + 1 - j)\, R_{\lfloor j \rfloor} + (j - \lfloor j \rfloor)\, R_{\lceil j \rceil}, \qquad (3.50)$$

where $j = 1 + \frac{(d-1)(i-1)}{k-1}$.
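The two interpolation steps can be sketched in Python as follows. The function `resample` stretches a vector of length $d$ to length $k$ by linear interpolation at the fractional index $j = 1 + (d-1)(i-1)/(k-1)$, and `interpolate_models` blends the two resampled neighbours with weights proportional to the distance of $k$ from $k_1$ and $k_2$. The function names and the exact weighting at integer $j$ are assumptions of this sketch.

```python
import math

def resample(R, k):
    """Linearly interpolate the vector R (length d) to a vector of length k >= 2."""
    d = len(R)
    out = []
    for i in range(1, k + 1):                    # 1-based coordinate index
        j = 1 + (d - 1) * (i - 1) / (k - 1)      # fractional position in R
        lo = min(math.floor(j), d)
        hi = min(math.ceil(j), d)
        w = j - math.floor(j)                    # weight of the upper neighbour
        out.append((1 - w) * R[lo - 1] + w * R[hi - 1])
    return out

def interpolate_models(R_k1, k1, R_k2, k2, k):
    """Blend model vectors known in dimensions k1 <= k <= k2 into dimension k."""
    a = resample(R_k1, k)
    if k1 == k2:
        return a
    b = resample(R_k2, k)
    w1 = (k - k2) / (k1 - k2)                    # weight of the k1 model
    w2 = (k1 - k) / (k1 - k2)                    # weight of the k2 model
    return [w1 * x + w2 * y for x, y in zip(a, b)]
```

For $k$ halfway between $k_1$ and $k_2$ both weights equal $1/2$, and for $k = k_1$ or $k = k_2$ the known model is returned unchanged apart from resampling.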
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BKZ Reduction and Simulation 
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For many applications in cryptanalysis, BKZ algorithm is the most practical algorithm for lattice 
reduction in high dimension. It was proposed by Schnorr and Euchner in 1994 [87], in an application to solving subset sum problems. There was a minor improvement [89] with the notion of "pruning", but this pruning was not optimized, and no analysis of it was given. The first implementation was provided as part of the NTL library by Shoup [91,92]. However, its practical performance is little known.

In this chapter, we first give the basic algorithm description. The original BKZ reduction has a parameter $\beta$, namely the blocksize, and it was considered that for $\beta > 25$ the reduction would no longer be practical [28]. However, thanks to the idea of extreme pruning, we are able to go beyond this limit. We also present an analysis of BKZ, building upon the analysis of enumeration in the previous chapter. Based on this analysis, we propose a simulation algorithm which predicts the running time and the quality of the output basis. When the blocksize is large, say $\beta > 50$, the time spent in enumeration dominates the cost of the BKZ algorithm, thus the running time can be approximated by the sum of the enumeration times.

We also introduce many improvements to BKZ reduction, including the new enumeration routine introduced in chapter 3, preprocessing of each block prior to enumeration, and many others. The new BKZ 2.0 algorithm accepts blocksizes as high as 90 or even more, making it the state of the art [19]. The new algorithm needs more parameters for preprocessing and pruning. Building upon many experiments, we computed a comprehensive table of optimal reduction parameters for blocksize $< 90$.
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4. BKZ Reduction and Simulation 


4.1 Original algorithm 

4.1.1 Algorithm description 

The BKZ algorithm provides a hierarchy of reductions between LLL and HKZ reduction in the following sense. It takes the blocksize $\beta$ as parameter. When $\beta = 2$, the output basis is LLL-reduced, whereas if $\beta = n$, the output basis is HKZ-reduced. For intermediate values, the larger $\beta$ is, the more reduced the output basis is, but also the longer the algorithm takes.

One important characteristic of the BKZ algorithm is that it runs in rounds. A sliding window of width $\beta$ travels from the beginning of the basis to the end. When it reaches the end, the round is over. The algorithm continues with a new round, until an entire round brings no further reduction. The description in pseudo code is given in Algorithms 18 and 19.

In OneRoundBKZ, the starting and ending indices $j$ and $k$ of the sliding window are increased in each iteration, until they reach the end $n$. Using enumeration, we find a shortest vector in $\mathcal{L}(B_{[j,k]})$. Let $v$ be the integer coefficients of the basis vectors $(b_j, \ldots, b_k)$ such that $\pi_j(v \cdot (b_j, \ldots, b_k))$ is a shortest vector in $\mathcal{L}(B_{[j,k]})$, and let $b_{new}$ denote the vector $v \cdot (b_j, \ldots, b_k)$. If $b_{new} \neq b_j$, then $b_{new}$ is inserted between $b_{j-1}$ and $b_j$, and a new basis is generated from this generating set, keeping $b_{new}$ in the $j$-th position. Finally, LLL reduction is applied to the next block.

Upon termination of the algorithm, the output basis is guaranteed to be LLL-reduced and to satisfy the following condition:

$$\|b_i^*\| = \lambda_1(\mathcal{L}(B_{[i, \min(i+\beta-1, n)]})) \qquad (4.1)$$


4.1.2 Hermite factor of output basis 

In this subsection we attempt to establish an analysis based on the average performance of BKZ on random bases.


Running time 

No good upper bound on the complexity of BKZ is known. The best upper bound known for the number of calls to the enumeration subroutine is exponential (see [44]). However, as Hanrot, Pujol and Stehlé pointed out, after a polynomial number of calls to enumeration, the quality of the basis is already very close to that of the final output.

The cost of each enumeration is super-exponential in the blocksize $\beta$, and therefore so is the cost of the BKZ algorithm.


Algorithm 18 Block Korkine-Zolotarev (BKZ) algorithm 

Input: A basis B = (bi,..., b„ ), a target Hermite factor d>o or target half volume ;/q. 

Output: The basis $(b_1, \ldots, b_n)$ is BKZ-$\beta$ reduced
1: $(B, U) \leftarrow$ LLL$(b_1, \ldots, b_n, U)$; // LLL-reduce the basis, and obtain the Gram-Schmidt orthogonalisation $U$
2: $f \leftarrow$ successful
3: while $f =$ successful do
4:   $(B, U, f) \leftarrow$ OneRoundBKZ$(\beta, B, U)$
5: end while
5: end while 
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Algorithm 19 OneRoundBKZ 

Input: A basis $B = (b_1, \ldots, b_n)$,
  the Gram-Schmidt triangular matrix $U$ and $\|b_1^*\|^2, \ldots, \|b_n^*\|^2$,
  a blocksize $\beta \in \{2, \ldots, n\}$, and enumeration radius $r$ and $T$.
Output: The basis $(b_1, \ldots, b_n)$ after one round of BKZ-$\beta$ reduction,
  and a flag $f$ marking whether any enumeration was successful during this round.
1: $j \leftarrow 0$;
2: $f \leftarrow$ unsuccessful;
3: while $j < n - 1$ do
4:   $j \leftarrow (j \bmod (n - 1)) + 1$; $k \leftarrow \min(j + \beta - 1, n)$; $h \leftarrow \min(k + 1, n)$; // define the local block
5:   $v \leftarrow$ Enum$(\mu_{[j,k]}, \|b_j^*\|^2, \ldots, \|b_k^*\|^2)$; // find $v = (v_j, \ldots, v_k) \in \mathbb{Z}^{k-j+1} \setminus \{0\}$ s.t. $\|\pi_j(\sum_{i=j}^{k} v_i b_i)\| = \lambda_1(L_{[j,k]})$
6:   if $v \neq (1, 0, \ldots, 0)$ then
7:     $f \leftarrow$ successful;
8:     LLL$(b_1, \ldots, \sum_{i=j}^{k} v_i b_i, b_j, \ldots, b_h, U)$ at stage $j$; // insert the new vector in the lattice at the start of the current block, then remove the dependency in the current block, update $\mu$.
9:   else
10:    LLL$(b_1, \ldots, b_h, U)$ at stage $h - 1$; // LLL-reduce the next block before enumeration.
11:  end if
12: end while
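To make the index bookkeeping of one round concrete, the short Python generator below lists the triples $(j, k, h)$ visited by the sliding window, using the same update rules as in the pseudo code. It only illustrates the window positions; the enumeration and LLL calls are deliberately left out, and the function name is chosen for this example.

```python
def one_round_blocks(n, beta):
    """Yield the 1-based indices (j, k, h) of each local block in one BKZ round."""
    j = 0
    while j < n - 1:
        j = (j % (n - 1)) + 1        # advance the start of the window
        k = min(j + beta - 1, n)     # end of the local block
        h = min(k + 1, n)            # end of the range passed to LLL
        yield j, k, h

blocks = list(one_round_blocks(6, 3))
# blocks == [(1, 3, 4), (2, 4, 5), (3, 5, 6), (4, 6, 6), (5, 6, 6)]
```

The last $\beta - 1$ windows are truncated at $n$, so the blocks near the end of the basis have dimension smaller than $\beta$.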


Upper bound for output quality 

As for the output quality, Schnorr and Euchner proved in their original work [87] that for a BKZ-$\beta$ reduced basis, $\|b_1\|$ has an upper bound given by:

||bi|| < Ypf 1 ■ vol(C) 1/n 

The study by Gama and Nguyen also pointed out that with the same technique, we are able to prove an 
upper bound of 

IIbi|| < -vol{C) 1/n 

In practice, the quality of the bases output by BKZ is better than the best theoretical worst-case bounds: according to [28], the Hermite factor of $n$-dimensional BKZ-$\beta$ reduced bases is of the form $c(\beta, n)^n$, where $c(\beta, n)$ seems to converge quickly as $n$ grows to infinity, whereas the theoretical upper bounds are of the form $c'(\beta)^n$ with $c'(\beta)$ significantly larger than $c(\beta, n)$. For instance, $c(20, n) \approx 1.0128$ for large $n$.

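By the end of this subsection, we will list $\lim_{n \to +\infty} c(\beta, n)$ for all $50 < \beta < 1000$. In particular, we will prove the following:

$$\lim_{n \to \infty} c(\beta, n) = \left( \frac{\beta}{2\pi e} (\pi \beta)^{1/\beta} \right)^{\frac{1}{2(\beta-1)}} \qquad (4.2)$$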


Reduction blocks and random lattices 

To predict the output quality of a BKZ-reduced basis, we need some assumption on each block of the reduction.

Intuitively, the first minimum of most local blocks looks like that of a random lattice of dimension the blocksize: this phenomenon does not hold for small blocksizes $< 30$ (as noted by Gama and Nguyen [28]), but it becomes more and more true as the blocksize increases, as shown in Fig. 4.1, where we see that the expectation and the standard deviation of the first minimum seem to converge to those of a random lattice. Intuitively, this may be explained by a concentration phenomenon: as the dimension increases, random lattices dominate in the set of lattices, so unless there is a strong reason why a given lattice cannot be random, we may heuristically assume that it behaves like a random lattice.

Figure 4.1: Comparing the first minimum of a non-extreme local block during BKZ-$\beta$ reduction with that of a random lattice of dimension $\beta$. Expectations with and without standard deviation are given.


Inductive analysis on Hermite factor 

Let $\{b_1, \ldots, b_n\}$ be the output basis of the BKZ algorithm, and $\{b_1^*, \ldots, b_n^*\}$ be its Gram-Schmidt orthogonalized basis. We also write $l_i = \log(\|b_i^*\|)$ and $L_i = \log(vol(\mathcal{L}(B_{[i,n]})))$ for simplicity.

For convenience we also define the following values:

• Local Hermite factor $h_i$, $\forall 1 \le i \le n$. It is the log of the root Hermite factor of the local block $B_{[i, \min(i+\beta-1, n)]}$:

$$h_i = \frac{1}{\beta - 1} \left( \log(\|b_i^*\|) - \frac{1}{\min(\beta, n-i+1)} \log\big(vol(\mathcal{L}(B_{[i, \min(i+\beta-1, n)]}))\big) \right)$$

• Global Hermite factor $g_i$, $\forall 1 \le i < n$. It is the log of the root Hermite factor of the block $B_{[i,n]}$:

$$g_i = \frac{1}{n-i} \left( \log(\|b_i^*\|) - \frac{1}{n-i+1} L_i \right) = \frac{1}{n-i+1}\, l_i - \frac{1}{(n-i)(n-i+1)} \sum_{j=i+1}^{n} l_j$$

The local Hermite factor $h_i$ is directly linked to our assumptions on enumeration, while the global Hermite factor $g_i$ is what we are attempting to analyze. Ultimately, the root Hermite factor of the whole basis is

$$\delta(B) = \left( \|b_1\| / vol(L)^{1/n} \right)^{1/n} = \exp(g_1)^{(n-1)/n}. \qquad (4.3)$$

In the following, we build a relationship between $g_i$ and $h_i$.

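Lemma 4.1.1. We can calculate $g_i$ from $h_i$ by induction, using the following relationship:

$$g_i = \begin{cases} \dfrac{\beta-1}{n-i}\, h_i, & n-\beta+1 \le i < n \\[2ex] \dfrac{\beta}{n-i+1}\, h_i + \dfrac{n-i-\beta+1}{(\beta-1)(n-i+1)} \displaystyle\sum_{j=1}^{\beta-1} g_{i+j}, & 1 \le i \le n-\beta \end{cases}$$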

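Proof. A BKZ-reduced basis exhibits an inductive structure: if $B$ is BKZ-reduced, then the projected blocks $B_{[i,n]}$ are BKZ-reduced too for all $i$. Our analysis is based on this induction. For $n-\beta+1 \le i < n$, the block spans indices $i$ to $n$, therefore $g_i = \frac{\beta-1}{n-i} \cdot h_i$. For $1 \le i \le n-\beta$, the block spans indices $i$ to $i+\beta-1$, and $g_i$ has to be derived from $h_i$ and $g_{i+1}, \dots, g_{i+\beta-1}$.

Notice that $h_i$ and $g_i, g_{i+1}, \dots, g_{i+\beta-1}$ satisfy the following relationship:

\[
\begin{pmatrix} h_i \\ g_i \\ g_{i+1} \\ \vdots \\ g_{i+\beta-1} \end{pmatrix}
=
\begin{pmatrix}
\frac{1}{\beta} & \frac{-1}{\beta(\beta-1)} & \cdots & \frac{-1}{\beta(\beta-1)} & 0 \\
\frac{1}{n-i+1} & \frac{-1}{(n-i)(n-i+1)} & \cdots & \frac{-1}{(n-i)(n-i+1)} & \frac{-1}{(n-i)(n-i+1)} \\
0 & \frac{1}{n-i} & \cdots & \frac{-1}{(n-i-1)(n-i)} & \frac{-1}{(n-i-1)(n-i)} \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & \frac{1}{n-i-\beta+2} & \frac{-1}{(n-i-\beta+1)(n-i-\beta+2)}
\end{pmatrix}
\begin{pmatrix} l_i \\ l_{i+1} \\ \vdots \\ l_{i+\beta-1} \\ L_{i+\beta} \end{pmatrix}
\tag{4.4}
\]

Let $c_0, c_{i+1}, \dots, c_{i+\beta-1}$ be the coefficients of the induction, so that the induction takes the following form:

\[ g_i = c_0 \cdot h_i + \sum_{j=1}^{\beta-1} c_{i+j} \cdot g_{i+j}, \]

which is equivalent to

\[ g_i = \begin{pmatrix} c_0 & c_{i+1} & \cdots & c_{i+\beta-1} \end{pmatrix} \begin{pmatrix} h_i \\ g_{i+1} \\ \vdots \\ g_{i+\beta-1} \end{pmatrix}. \]

We now compute $c_0, c_{i+1}, \dots, c_{i+\beta-1}$. Substituting the rows of (4.4), they are the solutions of the following system:

\[
\begin{pmatrix} \frac{1}{n-i+1} & \frac{-1}{(n-i)(n-i+1)} & \cdots & \frac{-1}{(n-i)(n-i+1)} \end{pmatrix}
=
\begin{pmatrix} c_0 & c_{i+1} & \cdots & c_{i+\beta-1} \end{pmatrix}
\begin{pmatrix}
\frac{1}{\beta} & \frac{-1}{\beta(\beta-1)} & \cdots & \frac{-1}{\beta(\beta-1)} & 0 \\
0 & \frac{1}{n-i} & \cdots & \frac{-1}{(n-i-1)(n-i)} & \frac{-1}{(n-i-1)(n-i)} \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & \frac{1}{n-i-\beta+2} & \frac{-1}{(n-i-\beta+1)(n-i-\beta+2)}
\end{pmatrix}
\]

This is a triangular system, which can be solved by forward substitution. After some computation, we obtain the following solution:

\[
\begin{pmatrix} c_0 \\ c_{i+1} \\ c_{i+2} \\ \vdots \\ c_{i+\beta-1} \end{pmatrix}
=
\begin{pmatrix}
\frac{\beta}{n-i+1} \\
\frac{n-i-\beta+1}{(\beta-1)(n-i+1)} \\
\frac{n-i-\beta+1}{(\beta-1)(n-i+1)} \\
\vdots \\
\frac{n-i-\beta+1}{(\beta-1)(n-i+1)}
\end{pmatrix}
\tag{4.5}
\]

This proves the induction formula of the lemma.

□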
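As a sanity check of Lemma 4.1.1, the following minimal Python sketch compares the induction with the direct definition of $g_i$ on an arbitrary profile. The function names and the random test profile are illustrative assumptions, not part of our implementation; $h_i$ and $g_i$ are computed exactly as defined above, with 1-based indices.

```python
import random

def local_h(l, i, beta):
    """h_i for a 1-based index i, with the 1/(beta-1) normalization."""
    n = len(l)
    d = min(beta, n - i + 1)              # dimension of the local block
    block = l[i - 1:i - 1 + d]
    return (l[i - 1] - sum(block) / d) / (beta - 1)

def global_g(l, i):
    """g_i for 1 <= i < n, computed directly from its definition."""
    n = len(l)
    tail = l[i - 1:]
    return (l[i - 1] - sum(tail) / (n - i + 1)) / (n - i)

def g_by_induction(l, beta):
    """Return {i: g_i}, using only the h_i and the induction of the lemma."""
    n = len(l)
    g = {}
    for i in range(n - 1, 0, -1):
        h = local_h(l, i, beta)
        if i >= n - beta + 1:             # block reaches the end of the basis
            g[i] = (beta - 1) / (n - i) * h
        else:
            m = n - i + 1
            c0 = beta / m
            c = (m - beta) / ((beta - 1) * m)
            g[i] = c0 * h + c * sum(g[i + j] for j in range(1, beta))
    return g

# Hypothetical profile: any sequence of log-norms will do for this check.
random.seed(0)
profile = [random.uniform(-1.0, 1.0) for _ in range(40)]
g = g_by_induction(profile, beta=12)
worst = max(abs(g[i] - global_g(profile, i)) for i in g)
print("largest deviation between induction and definition:", worst)
```

The deviation should be at the level of floating-point rounding, since the induction is an exact identity on the $l_i$.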


By developing this induction and collecting the terms in $h_i$, we can write $g_1$ as a linear combination of the $h_i$:

\[ g_1 = \sum_{i=1}^{n-1} k_i h_i. \tag{4.6} \]

We denote by $K$ the sum of the coefficients $k_i$:

\[ K = \sum_{i=1}^{n-1} k_i. \tag{4.7} \]

In the following lemma, we study the properties of $k_i$, which describes the contribution of $h_i$ to $g_1$.

Lemma 4.1.2. The coefficients $k_i$ have the following properties.

1. $k_1 = \beta/n$.

2. $\forall i > 1$, $k_i < e \cdot \frac{\max(n-\beta,\,\beta)}{n(n-1)}$. In particular, $k_i < e/n$.

3. $1 \le K < 1 + \dots$ The lower bound is reached when $n = \beta$. In addition, there exists a constant $c \approx 5.96$ such that $\lim_{n \to \infty} \dots \to c$.
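Proof. By developing the induction, we discuss $k_i$ in two cases:

• If $2\beta \le n$, $k_i$ is given by the following induction:

\[
\begin{aligned}
& k_1 = \beta/n, \qquad k_2 = \frac{\beta(n-\beta)}{n(\beta-1)(n-1)}, \\
& \forall i,\ 2 < i \le \beta: && k_i = k_{i-1} \cdot \frac{\beta}{\beta-1}, \\
& \forall i,\ \beta+1 \le i \le n-\beta+1: && k_i = k_{i-1} \cdot \frac{\beta}{\beta-1} - k_{i-\beta} \cdot \frac{1}{\beta-1}, \\
& \forall i,\ n-\beta+2 \le i \le n-1: && k_i = k_{i-1} \cdot \frac{n-i+1}{n-i} - k_{i-\beta} \cdot \frac{n-i+1}{\beta(n-i)}.
\end{aligned}
\]

• If $2\beta > n$, $k_i$ is given by the following induction:

\[
\begin{aligned}
& k_1 = \beta/n, \qquad k_2 = \frac{\beta(n-\beta)}{n(\beta-1)(n-1)}, \\
& \forall i,\ 2 < i \le n-\beta+1: && k_i = k_{i-1} \cdot \frac{\beta}{\beta-1}, \\
& \forall i,\ n-\beta+2 \le i \le \beta: && k_i = k_{i-1} \cdot \frac{n-i+1}{n-i}, \\
& \forall i,\ \beta+1 \le i \le n-1: && k_i = k_{i-1} \cdot \frac{n-i+1}{n-i} - k_{i-\beta} \cdot \frac{n-i+1}{\beta(n-i)}.
\end{aligned}
\]

Here, $\max_{i>1} k_i$ is reached by $k_\beta$. When $n \ge 2\beta$,

\[ k_\beta = \left(\frac{\beta}{\beta-1}\right)^{\beta-1} \frac{n-\beta}{n(n-1)} = \left(1 + \frac{1}{\beta-1}\right)^{\beta-1} \frac{n-\beta}{n(n-1)} < e \cdot \frac{n-\beta}{n(n-1)}; \]

when $n < 2\beta$,

\[ k_\beta = \left(\frac{\beta}{\beta-1}\right)^{n-\beta-1} \frac{\beta}{n(n-1)} = \left(1 + \frac{1}{\beta-1}\right)^{n-\beta-1} \frac{\beta}{n(n-1)} \le \left(1 + \frac{1}{\beta-1}\right)^{\beta-1} \frac{\beta}{n(n-1)} < e \cdot \frac{\beta}{n(n-1)}. \]

To conclude, $k_i < e \cdot \max\left(\frac{n-\beta}{n(n-1)}, \frac{\beta}{n(n-1)}\right) \le e/n$.

□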
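The coefficients $k_i$ can also be obtained numerically by unrolling the induction of Lemma 4.1.1 on coefficient vectors. The sketch below does this under the definitions above; the function name `hermite_coefficients` is a hypothetical helper introduced only for illustration, and the parameters $n = 100$, $\beta = 30$ are an arbitrary example.

```python
import math

def hermite_coefficients(n, beta):
    """Return [k_1, ..., k_{n-1}] such that g_1 = sum_i k_i * h_i."""
    coef = {}                                  # coef[i][t-1] = weight of h_t in g_i
    for i in range(n - 1, 0, -1):
        row = [0.0] * (n - 1)
        if i >= n - beta + 1:                  # tail block: g_i = (beta-1)/(n-i) * h_i
            row[i - 1] = (beta - 1) / (n - i)
        else:                                  # full block: induction of Lemma 4.1.1
            m = n - i + 1
            row[i - 1] = beta / m
            c = (m - beta) / ((beta - 1) * m)
            for j in range(1, beta):
                for t, x in enumerate(coef[i + j]):
                    row[t] += c * x
        coef[i] = row
    return coef[1]

k = hermite_coefficients(100, 30)
print("k_1 =", k[0], " beta/n =", 30 / 100)
print("max_{i>1} k_i =", max(k[1:]), " e/n =", math.e / 100)
print("K =", sum(k))
```

Evaluating $K$ this way for a fixed $\beta$ and increasing $n$ gives a direct numerical view of how quickly it approaches 1.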


The following figure illustrates $k_i$, $i > 1$, for fixed $n$ and varying $\beta$ ($n > 2\beta$). The sequence is upper bounded by $e/n$.

Figure 4.2: $k_i$, $i > 1$, for $n = 100$ and $\beta = 2, 10, 35, 65, 80$

We also have $\lim_{n \to \infty} K = 1$. More precisely, $K = 1 + O(\beta^2/n^2)$, and the following figure illustrates the speed of convergence of $K$ for fixed $\beta = 30$ and increasing $n$.

Figure 4.3: $K(\beta, n)$ for $\beta = 30$ and $n$ from 30 to 130



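Corollary 4.1.3. If for $1 < i \le n-1$ the $h_i$ are i.i.d. variables with $E(h_i) = h$ and $\mathrm{Var}(h_i) = \sigma^2$, then $E(g_1) = \frac{\beta}{n} h_1 + \left(K - \frac{\beta}{n}\right) h$, and $\mathrm{Var}(g_1) < \dots$

Remark. Here are some remarks about this corollary:

• $h_1$ has a much greater influence on $g_1$ than any other $h_i$.

• If moreover $E(h_1) = h$, then $E(g_1) = Kh$.

• Assuming the Gaussian heuristic, $h \approx \frac{1}{\beta-1} \log\left(v_\beta^{-1/\beta}\right)$, where $v_\beta$ is the volume of the $\beta$-dimensional unit ball. Using Stirling's approximation, we have

\[ \lim_{n \to \infty} E(g_1) = h = \frac{1}{2(\beta-1)} \log\left( \frac{\beta}{2\pi e} \cdot (\pi\beta)^{\frac{1}{\beta}} \right). \]

In other words, for a BKZ-$\beta$ reduced basis of a random lattice, we have

\[ \lim_{n \to \infty} \delta(B) = \lim_{n \to \infty} \left( \frac{\|b_1\|}{\mathrm{vol}(\mathcal{L})^{1/n}} \right)^{\frac{1}{n-1}} = \left( \frac{\beta}{2\pi e} (\pi\beta)^{\frac{1}{\beta}} \right)^{\frac{1}{2(\beta-1)}}. \tag{4.8} \]

• If for each enumeration the returned vector is a solution to HSVP$_\gamma$, which means $h \approx \frac{1}{\beta-1} \log\left(\gamma \cdot v_\beta^{-1/\beta}\right)$, then similarly we have

\[ \lim_{n \to \infty} \delta(B) = \left( \gamma \cdot v_\beta^{-1/\beta} \right)^{\frac{1}{\beta-1}}. \tag{4.9} \]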
This limit root Hermite factor for $\beta$ between 50 and 1000 is illustrated in Table 4.1 and Figure 4.4.

Figure 4.4: Root Hermite factor for typical BKZ-$\beta$ reduced basis ($50 \le \beta \le 1000$)

Table 4.1: Root Hermite factor for typical BKZ-$\beta$ reduced basis ($50 \le \beta \le 1000$)

$\beta$       50      60      70      80      90      100     110
$\delta(B)$   1.0121  1.0115  1.0108  1.0103  1.0097  1.0093  1.0088

$\beta$       120     130     140     150     160     170     180
$\delta(B)$   1.0084  1.0081  1.0078  1.0075  1.0072  1.0067  1.0065

$\beta$       190     200     210     220     230     240     250
$\delta(B)$   1.0065  1.0063  1.0061  1.0059  1.0058  1.0056  1.0055

$\beta$       300     400     500     600     700     800     1000
$\delta(B)$   1.0048  1.0040  1.0034  1.0030  1.0027  1.0024  1.0020
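The limit in (4.8) is straightforward to evaluate. The short Python sketch below computes it, together with a variant that uses the exact unit-ball volume $v_\beta$ instead of Stirling's approximation; both function names are illustrative assumptions.

```python
import math

def root_hermite_stirling(beta):
    """Limit root Hermite factor of (4.8), with Stirling's approximation."""
    base = beta / (2 * math.pi * math.e) * (math.pi * beta) ** (1.0 / beta)
    return base ** (1.0 / (2 * (beta - 1)))

def root_hermite_exact_ball(beta):
    """Same quantity, using log v_beta = (beta/2) log(pi) - log Gamma(beta/2 + 1)."""
    log_v = 0.5 * beta * math.log(math.pi) - math.lgamma(0.5 * beta + 1)
    return math.exp(-log_v / (beta * (beta - 1)))

for beta in (50, 100, 200, 1000):
    print(beta,
          round(root_hermite_stirling(beta), 4),
          round(root_hermite_exact_ball(beta), 4))
```

For blocksizes in this range the two variants differ only far beyond the fourth decimal place, so either can be used to extend the table to other values of $\beta$.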


4.2 Simulate reduction procedure 

The worst-case analysis of the running time is given by Hanrot et al. [44], who also give an asymptotic bound on the speed of convergence of the output quality of the basis. In contrast, this section is devoted to the average-case analysis of the reduction procedure.

BKZ reduction is naturally separated into several rounds. The goal of our simulation algorithm is to predict the Gram-Schmidt sequence $(\|b_1^*\|, \|b_2^*\|, \dots, \|b_n^*\|)$ during the execution of BKZ, more precisely at the beginning of every round. One round of BKZ costs essentially $n - 1$ enumeration calls.

We assume that the input basis is a "random" reduced basis, without any special property. Once we can predict the value of $\lambda_1(L_{[j,k]})$ for each local block, we know that this will be the new value of $\|b_j^*\|$ by definition of the enumeration subroutine, the only exception being when $\|b_j^*\|$ is already shorter than the predicted value. This could happen when, for example, the value of $\beta$ in the simulation is not big enough, and the basis already has higher quality than can be provided by BKZ-$\beta$. Knowing the new value of $\|b_j^*\|$ allows us to deduce the volume of the next local block, and therefore to iterate the process until the end of the round. This gives rise to our simulation algorithm (see Alg. 20).

Algorithm 20 SimulateBKZ1

Input: The Gram-Schmidt norms, given as $\ell_i = \log(\|b_i^*\|)$, for $i = 1, \dots, n$,
a blocksize $\beta \in \{50, \dots, n\}$.

Output: A prediction for the Gram-Schmidt norms $\ell'_i = \log(\|b_i^*\|)$, $i = 1, \dots, n$, after one round of BKZ reduction.


Algorithm 21 SimulateOneRoundBKZ1

Input: The Gram-Schmidt norms, given as $\ell_i = \log(\|b_i^*\|)$, for $i = 1, \dots, n$,
a blocksize $\beta \in \{50, \dots, n\}$.

Output: A prediction for the Gram-Schmidt norms $\ell'_i = \log(\|b_i^*\|)$, $i = 1, \dots, n$.

for $k = 1$ to $n$ do
    $f \leftarrow \min(k + \beta - 1, n)$    // End index of local block
    $\log V \leftarrow \sum_{i=1}^{f} \ell_i - \sum_{i=1}^{k-1} \ell'_i$
    $\ell'_k \leftarrow$ prediction of $\lambda_1(B_{[k,f]})$, based on $\log V$
end for


To start with, we make the assumption that $\beta \ge 50$. This allows us to approximate each block by a random basis.

We predict the first minimum $\lambda_1(L_{[j,k]})$ as follows:

• For most indices $j$, we choose $GH(B_{[j,k]})$, unless $\|b_j^*\|$ was already shorter and $\|b_1^*\|, \dots, \|b_{j-1}^*\|$ had not been changed in the current round.

• However, for the last indices $j$, namely those inside the last 50-dimensional block $B_{[n-49,n]}$, we do something different: since this last block will be HKZ-reduced at the end of the round, we assume that it behaves like an HKZ-reduced basis of a random lattice of the same volume. To simplify, we determine the last 50 Gram-Schmidt norms from the average Gram-Schmidt norms of an HKZ-reduced basis of a random 50-dimensional lattice of unit volume: these average norms were computed experimentally and are displayed in Fig. 4.5.

Here, if we used the Gaussian heuristic prediction of $\lambda_1$ for each block directly, we would need the assumption that the basis is not yet BKZ-$\beta$ reduced, because for each block we would suppose that its first vector is longer than the Gaussian heuristic prediction and is therefore always updated by the enumeration. But this assumption is not always true. In particular, if the input basis is already much better reduced, such a simulation would output a worse basis, which never happens in an actual reduction.

Figure 4.5: Typical 50-dimensional HKZ-reduced basis with unit volume

In practice, we already know the length of the first vector $b_j^*$ of each block, and this may affect our prediction based on the Gaussian heuristic. For example, consider the case where $\|b_j^*\|$ is already shorter than the prediction given by the Gaussian heuristic. In particular, we want to distinguish the case where a vector shorter than $b_j^*$ is found from the case where $b_j^*$ is unchanged. In the latter case, the rest of the basis is unchanged after this operation. Therefore we set our prediction of the shortest length to be $\min(\|b_j^*\|, GH)$. If $b_1^*, \dots, b_{j-1}^*$ have not been changed by the enumerations of blocks 1 to $j-1$, then we know that $b_j^*$ has not been modified since the beginning of the round, and if $\|b_j^*\| < GH$, the enumeration will return $b_j^*$ as the shortest vector, keeping the rest of the basis unchanged. We use the flag $\varphi$ to mark whether $b_1^*, \dots, b_{j-1}^*$ are unchanged in the current round. When $\varphi$ is true, the comparison between $\|b_j^*\|$ and $GH$ is used to predict $\lambda_1$ of the new block, and $\varphi$ is modified only when $GH < \|b_j^*\|$. The simulation is described in Algorithm 22.
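The one-round simulation with the flag $\varphi$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not our implementation: the function names are hypothetical, blocks have dimension $\min(\beta, n-k+1)$, and the HKZ tail profile (the averages of Fig. 4.5) must be supplied by the caller. The flat profile used in the example is only a placeholder, not experimental data.

```python
import math

def log_unit_ball_volume(d):
    """log v_d, the log-volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)

def simulate_one_round(l, beta, tail_profile):
    """Predict the log Gram-Schmidt norms after one round of BKZ-beta.

    l            -- list of log||b_i^*|| (position 0 holds b_1^*)
    beta         -- blocksize
    tail_profile -- ASSUMED input: log-norms of an average unit-volume
                    HKZ-reduced basis; its length t is the tail dimension
    """
    n, t = len(l), len(tail_profile)
    new = list(l)
    unchanged = True                      # the flag phi
    for k in range(n - t):                # 0-based position of the block start
        f = min(k + beta, n)              # exclusive end of the local block
        d = f - k                         # block dimension, at most beta
        log_vol = sum(l[:f]) - sum(new[:k])
        gh = (log_vol - log_unit_ball_volume(d)) / d   # log of the GH length
        if unchanged:
            if gh < l[k]:
                new[k] = gh
                unchanged = False
            else:
                new[k] = l[k]             # enumeration returns b_k^* itself
        else:
            new[k] = gh
    # Last t norms: rescale the unit-volume tail profile to the remaining volume.
    log_vol = sum(l) - sum(new[:n - t])
    for k in range(n - t, n):
        new[k] = log_vol / t + tail_profile[k - (n - t)]
    return new

# Example with a hypothetical geometric input profile and a flat placeholder tail.
start = [-0.05 * i for i in range(120)]
after = simulate_one_round(start, beta=60, tail_profile=[0.0] * 50)
print(after[0], sum(after) - sum(start))
```

Because the tail profile has unit volume (its entries sum to zero), the total log-volume is preserved by construction, which is a convenient check when experimenting with other profiles.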

4.3 BKZ with pruned enumeration 

An important improvement to BKZ comes from improvements in enumeration. As presented in chapter 3, the enumeration process can be greatly sped up by pruning. Besides, in BKZ we are not obliged to insist on finding a vector of length exactly $\lambda_1(B_{[j,\min(j+\beta-1,n)]})$. As long as some vector is renewed in a round, the basis is improved and the next round will take place to further improve the basis. That is to say, we are free to choose $p$, the probability of success when enumerating a short vector in each block. Finally, we are interested in finding any vector that is shorter than the current $b_j^*$, which means that instead of setting the goal length to $GH(B_{[j,\min(j+\beta-1,n)]})$, we can relax the goal of the enumeration by a factor $\gamma$ to $\gamma \cdot GH(B_{[j,\min(j+\beta-1,n)]})$.

This new BKZ algorithm takes parameters $\beta$, $p$ and $\gamma$. It needs a slightly different enumeration procedure, because we need to call many pruned enumerations for one block. In addition, the simulation of this reduction is not easy, because the enumeration is now probabilistic.

Algorithm 22 SimulateOneRoundBKZ2

Input: The Gram-Schmidt norms, given as $\ell_i = \log(\|b_i^*\|)$, for $i = 1, \dots, n$,
a blocksize $\beta \in \{50, \dots, n\}$.

Output: A prediction for the Gram-Schmidt norms $\ell'_i = \log(\|b_i^*\|)$, $i = 1, \dots, n$.

$(r_1, \dots, r_{50}) \leftarrow$ average $\log(\|b_1^*\|, \dots, \|b_{50}^*\|)$ of an HKZ-reduced random unit-volume 50-dim lattice
$\varphi \leftarrow$ true    // flag: true while $b_1^*, \dots, b_{k-1}^*$ are unchanged
for $k = 1$ to $n - 50$ do
    $f \leftarrow \min(k + \beta - 1, n)$    // End index of local block
    $d \leftarrow f - k + 1$    // Dimension of local block
    $\log V \leftarrow \sum_{i=1}^{f} \ell_i - \sum_{i=1}^{k-1} \ell'_i$
    if $\varphi$ = true then
        if $\frac{\log V - \log(v_d)}{d} < \ell_k$ then
            $\ell'_k \leftarrow \frac{\log V - \log(v_d)}{d}$
            $\varphi \leftarrow$ false
        else
            $\ell'_k \leftarrow \ell_k$
        end if
    else
        $\ell'_k \leftarrow \frac{\log V - \log(v_d)}{d}$
    end if
end for
$\log V \leftarrow \sum_{i=1}^{n} \ell_i - \sum_{i=1}^{n-50} \ell'_i$
for $k = n - 49$ to $n$ do
    $\ell'_k \leftarrow \frac{\log V}{50} + r_{k+50-n}$
end for

4.3.1 Probabilistic algorithm with parallel enumeration

The enumeration can be conveniently described as a main process calling many child enumeration processes. Each child process randomizes the basis independently and performs an enumeration with success probability $p_0$. The main process collects the outputs of the children and picks the shortest vector among them. In fact, if a total of $M$ randomized enumerations of success probability $p_0$ are launched, the overall success probability is $p \approx p_0 \cdot M$. The child process and the parent process are described in Algorithms 23 and 24 respectively.
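The approximation $p \approx p_0 \cdot M$ can be compared with the probability that at least one of $M$ children succeeds. The sketch below assumes that the $M$ randomized enumerations succeed independently, which is an assumption of this illustration; the function name is hypothetical.

```python
def combined_success(p0, m):
    """Probability that at least one of m independent trials succeeds."""
    return 1.0 - (1.0 - p0) ** m

for p0, m in [(0.01, 10), (0.05, 10), (0.10, 10)]:
    print(p0, m, combined_success(p0, m), p0 * m)
```

Under this independence assumption, the linear estimate $p_0 \cdot M$ is accurate as long as $p_0 \cdot M$ is small, and it overestimates the success probability as $p_0 \cdot M$ approaches 1.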


Algorithm 23 ChildEnum

Input: The block $B_{[j,k]}$, and the bounding function $R$.

Output: $b_{new} = v \cdot B_{[j,k]}$, the shortest vector found by enumeration with bounding function $R$.

$B' \leftarrow$ Randomize($B_{[j,k]}$)
$B' \leftarrow$ LLL($B'$)
$b_{new} \leftarrow$ Enumeration($B'$, $R$)


4.3.2 Simulation 

The BKZ reduction with pruned enumeration differs from the original BKZ in that it introduces two more parameters, $p$ and $\gamma$. For each block, the first vector is only expected to be renewed with probability $p$, so we no longer have a deterministic expectation. This complicates the simulation problem, and we are obliged to make some restrictions in order to give predictions.
Algorithm 24 OneRoundBKZ2 (Main process)

Input: $B$ the input basis, $\beta$ the blocksize, $M$ the number of child processes, and $R$ the bounding function for enumeration.

Output: $B$ after one round of BKZ-$(\beta, p, \gamma)$ reduction ($p = M \cdot p_{succ}(R)$, $\gamma = R_n$),
and $f$ marking whether any enumeration was successful during this round.

$f \leftarrow$ unsuccessful
for $j = 1, \dots, n - 1$ do
    $k \leftarrow \min(j + \beta - 1, n)$
    for $i = 1, \dots, M$ do
        $(b_{new}^{(i)}, v^{(i)}) \leftarrow$ ChildEnum($B_{[j,k]}$, $R$)
    end for
    $m \leftarrow \arg\min_{1 \le i \le M} \|b_{new}^{(i)}\|$
    if $\|b_{new}^{(m)}\| < \|b_j^*\|$ then
        $f \leftarrow$ successful
        $B \leftarrow$ LLL($b_1, \dots, b_{j-1}, \sum_{i=j}^{k} v_i^{(m)} b_i, b_j, \dots$)
    end if
end for


An ideal simulation process, in its most general form, takes the following input and outputs, for each round, the shape of the basis being reduced. It is described in Algorithm 25.

Algorithm 25 Ideal_SimulateOneRoundBKZ

Input: The Gram-Schmidt norms, given as $\ell_i = \log(\|b_i^*\|)$, for $i = 1, \dots, n$,
a blocksize $\beta \in \{2, \dots, n\}$, a relaxation factor $\gamma$, a success probability $p$.

Output: A prediction for the Gram-Schmidt norms $\ell'_i = \log(\|b_i^*\|)$, $i = 1, \dots, n$, after one round of BKZ reduction.


In practice, by making the following restrictions on the BKZ parameters, we still have a deterministic simulation algorithm.

1. $p \approx 1$, so that the reduction process can be predicted by a deterministic simulation process.

2. $\beta \ge 50$, so that each block behaves like a random basis.

With the restriction $p \approx 1$, we expect to renew the first vector of each block with a new vector of length $\gamma \cdot GH(B_{[j,k]})$. That is to say, we can substitute the quantity $\frac{\log V - \log(v_d)}{d}$ by $\log \gamma + \frac{\log V - \log(v_d)}{d}$ in Algorithm 22.

One tempting possibility to include $p$ in the simulation is to launch several Monte Carlo simulations. The average output and a variance can be computed from these simulations. However, this simulation does not fit the experimental data well. The problem is that we only consider $b_j^*$ to be generated by enumeration, but in fact it can also be modified during LLL reduction when we find a vector much shorter than $b_j^*$. When enumeration shortens certain $b_j^*$ but not others, swaps between vectors can happen during LLL reduction to re-order the vectors. We do not yet know how to incorporate the mechanism of swaps and LLL into our simulation.

We still have the following open problems to solve in order to find an ideal simulation procedure.

• Simulate efficiently while taking different $p < 1$ into consideration.

• Simulate BKZ with $\beta < 50$. It is reported in [28] that the blocks in BKZ differ considerably from random bases for small $\beta$. Thus, we do not know how to predict an average BKZ-$\beta$ reduction for smaller $\beta$.

An interesting question would be to compare the efficiency of reduction for different $\beta$, $p$ and $\gamma$. For instance, with pruning probability $p$ and relaxation parameter $\gamma$, we are able to do a weak enumeration in large blocksizes in the same time as a strong enumeration in small blocksizes. With a correct simulation we can find out which strategy improves the basis the most.

4.3.3 Simulation compared to experiment data 

We compared the experimental results with the predictions produced by the simulation algorithm. Under the restrictions $\beta \ge 50$ and $p \approx 1$, the simulation results are a good approximation of the experimental data. We ran the simulation both on random lattices and on Darmstadt's lattice challenges. Figure 4.7 compares the sequence of $\|b_i^*\|$ obtained experimentally (in red) with the prediction by simulation (in black). It turns out that the simulation matches the experimental data well.

Meanwhile, the prediction of $\delta(B)$ during the reduction procedure approximates the experimental data as well. This prediction is an application of Corollary 4.1.3. Figure 4.6 shows the evolution of $\delta(B)$ during BKZ-90. Most of the improvement of $\delta(B)$ happens in the first few rounds, which complements the theoretical results of [44].

Figure 4.6: Evolution and prediction of $\delta(B)$ during BKZ-90 reduction in dim 180 for Darmstadt's lattice challenges 500-625 (plot title: Prediction on evaluation of Hermite Factor)

Figure 4.7: Comparing simulation with experiment data. The figures show a zoom-in view of the first 50 basis vectors (panels: Comparing simulation and execution of BKZ with blocksize 50; Comparing simulation and execution of BKZ with blocksize 90)
4.4 BKZ 2.0: improved reduction


We implemented an improved version of BKZ, which incorporates several modifications to the original algorithm. Besides pruned enumeration, another important improvement is to preprocess each block before enumeration takes place. It is folklore that the enumeration time depends on the quality of the given basis, and that reducing the basis shortens the enumeration time. But such a reduction had never been performed on BKZ blocks before. In particular, it is important to find the optimal parameters for the preprocessing reduction: they depend on the current quality of the block, and on $\beta$, $p$ and $\gamma$.


4.4.1 Abort after a few rounds 


The most significant improvements of BKZ reduction occur only in the first few rounds. This is both proved by [44] and predicted by our simulation algorithm. Hence, instead of launching the reduction round after round, it is more efficient to stop the reduction after $K$ rounds, knowing that the following rounds would make only minor contributions to the improvement of the basis quality. This adds another parameter $K$ to BKZ reduction.
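The early-abort strategy amounts to a simple driver loop around the one-round procedure. The sketch below is only an illustration: `one_round` stands for any function returning the new state together with a flag telling whether the round improved the basis, and `toy_round` is a hypothetical stand-in used purely to make the example runnable.

```python
def reduce_with_abort(state, one_round, max_rounds):
    """Run rounds until no round improves the state or max_rounds is reached."""
    rounds_done = 0
    for _ in range(max_rounds):
        state, improved = one_round(state)
        rounds_done += 1
        if not improved:              # the usual BKZ termination condition
            break
    return state, rounds_done

def toy_round(x):
    """Hypothetical stand-in for one BKZ round: halves the distance to 1.0."""
    new_x = 1.0 + (x - 1.0) / 2.0
    return new_x, new_x < x

print(reduce_with_abort(2.0, toy_round, max_rounds=5))
```

In this toy model each round brings a smaller gain than the previous one, which mirrors the reason for aborting after a fixed number of rounds.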


4.4.2 Preprocessing with a smaller reduction 


Without any preprocessing, the quality of the basis of the block $B_{[j,k]}$ quickly becomes only LLL-reduced. This is because:

• For each enumeration, the local basis is only guaranteed to be LLL-reduced, even though the whole basis may be more than LLL-reduced. As enumeration finds vectors and the basis gets renewed, the rest of the basis is modified by the LLL reduction following each successful enumeration.

• In high blocksizes, most enumerations are successful: they find a shorter vector than the first block vector. This implies that a local LLL-reduction will be performed to get a basis from a generating set. At the next iteration, the enumeration will proceed on a typical LLL-reduced basis, and not on something likely to be better reduced.

Figure 4.8: Quality of a local block in original BKZ (plot title: Gram-Schmidt norm during BKZ reduction, $\beta = 65$, $\dim(L) = 200$)

Observations from experiments confirm this intuition. Figure 4.8 shows the quality of the whole basis during a typical BKZ iteration; one clearly sees that before enumeration, the local basis is only LLL-reduced, while other local bases (outside the window formed by the block) are more reduced.

Therefore, we apply to the block a BKZ-$\beta_0$ reduction which aborts after $Q$ rounds. Obviously $\beta_0$ is much smaller than $\beta$, otherwise the reduction cost would be much greater than the original enumeration cost. The optimal $\beta_0$ and $Q$ depend on $\beta$, $\gamma$ and $p$, and we searched for these parameters by comparing the reduction performance for different $\beta_0$ and $Q$ and choosing the best among them.


4.4.3 Recursive reduction 

Our preprocessing reduction during BKZ uses a blocksize of $\beta_0$, and when this $\beta_0$ gets big, one wonders if it is worth doing another preprocessing for each block of size $\beta_0$.

In general, we can have a recursive BKZ reduction: BKZ-$\beta_0$ can make calls to BKZ-$\beta_1$, which again calls BKZ-$\beta_2$, and so on, until level $L$, where $\beta_L$ is small enough for each enumeration in blocksize $\beta_L$ to be performed directly without any prior reduction. At level $i$ of the reduction, the set of parameters is $\beta_i, p_i, \gamma_i, Q_i$, and it is fixed at the beginning of the reduction. The total number of parameters is multiplied by $L$, the depth of the recursion, in comparison to the non-recursive BKZ algorithm.

An algorithmic description is given in Algorithms 26 and 27.




Algorithm 26 RecEnumeration

Input: $B$ the input basis,
$\beta_0, \beta_1, \dots, \beta_L$ the blocksizes,
$M$ the number of child processes,
$R_0, R_1, \dots, R_L$ the bounding functions for enumeration,
$Q_0, Q_1, \dots, Q_L$ the number of rounds before abort,
$j, k$ the beginning and ending index of the block.

Output: $B$ after one round of BKZ-$(\beta_0, p, \gamma)$ reduction ($p = M \cdot p_{succ}(R_0)$, $\gamma = R_n$),
and $f$ marking whether any enumeration was successful during this round.

$J_0 \leftarrow j$; $K_0 \leftarrow k$
$l \leftarrow 1$
$J_l \leftarrow J_0 - 1$
while ($q_l < Q_l$) do
    $J_l \leftarrow ((J_l - J_{l-1} + 1) \bmod (K_{l-1} - J_{l-1})) + J_{l-1}$    // next J
    $K_l \leftarrow \min(J_l + \beta_l - 1, K_{l-1})$    // next K
    while ($l < L$) and ($K_l - J_l > 1$) do
        $l \leftarrow l + 1$    // go down to small enums if current block is big
        $J_l \leftarrow J_{l-1}$
        $K_l \leftarrow \min(J_l + \beta_l - 1, K_{l-1})$
        $q_l \leftarrow 0$
    end while
    $(b_{new}, v) \leftarrow$ Enumerate($B_{[J_l,K_l]}$, $R_l$)
    $B \leftarrow$ LLL($b_1, \dots, b_{J_l-1}, \sum_{i=J_l}^{K_l} v_i b_i, b_{J_l}, \dots$)
    if $J_l = K_{l-1} - 1$ then
        $q_l \leftarrow q_l + 1$
    end if
    while ($q_l \ge Q_l$) and ($l > 0$) do
        $l \leftarrow l - 1$
        $b_{new} \leftarrow$ Enumerate($B_{[J_l,K_l]}$, $R_l$)
        $B \leftarrow$ LLL($b_1, \dots, b_{J_l-1}, b_{new}, b_{J_l}, \dots$)
        $B \leftarrow$ LLL($b_1, \dots, b_{J_l-1}, \sum_{i=J_l}^{K_l} v_i b_i, b_{J_l}, \dots$)
        if $J_l = K_{l-1} - 1$ then
            $q_l \leftarrow q_l + 1$
        end if
    end while
end while
Algorithm 27 OneRoundBKZ3 (Recursive)

Input: $B$ the input basis, $(\beta_0, \beta_1, \dots, \beta_L)$ the blocksizes,
$M$ the number of child processes,
$R_0, R_1, \dots, R_L$ the bounding functions for enumeration,
$Q_0, Q_1, \dots, Q_L$ the number of rounds before abort.

Output: $B$ after one round of BKZ-$(\beta_0, p, \gamma)$ reduction ($p = M \cdot p_{succ}(R_0)$, $\gamma = R_n$),
and $f$ marking whether any enumeration was successful during this round.

$f \leftarrow$ unsuccessful
for $j = 1, \dots, n - 1$ do
    $k \leftarrow \min(j + \beta_0 - 1, n)$
    for $i = 1, \dots, M$ do
        $(b_{new}^{(i)}, v^{(i)}) \leftarrow$ RecEnumeration($B$, $j$, $k$)
    end for
    $m \leftarrow \arg\min_{1 \le i \le M} \|b_{new}^{(i)}\|$
    if $\|b_{new}^{(m)}\| < \|b_j^*\|$ then
        $f \leftarrow$ successful
        $B \leftarrow$ LLL($b_1, \dots, b_{j-1}, \sum_{i=j}^{k} v_i^{(m)} b_i, b_j, \dots$)
    end if
end for


4.4.4 Optimal parameters for BKZ 

In the recursive version of BKZ, the number of parameters is multiplied by the total number of levels $L$. As a matter of fact, for a given $\beta$, $p$, $\gamma$, there are optimal preprocessing parameters $(\beta_0, p_0, Q_0), (\beta_1, p_1, Q_1), \dots$.

In our experiments, we actually fixed $\gamma$ to be $\|b_j^*\| / GH(B_{[j,k]})$ for simplicity. For each $\beta_0$ and $p_0$, we try to exhaust all possible preprocessing parameters, for $\beta_1 < \beta_0$ and $p_1 \in \{5\%, 10\%, \dots, 100\%\}$, unless it takes too long to finish, and record the time it takes to reduce an LLL-reduced 100-dimensional basis for one round. The best reduction time with preprocessing is then compared with enumeration without any preprocessing, to draw the final conclusion on the best parameters.

This search tries to exhaust all possible parameters, but the order of the experiments is carefully chosen in order to avoid wasted effort and to arrive at the optimal parameters faster. For example, while we search for the best preprocessing parameter for $\beta_0 = 35$, $p_0 = 10\%$, if with $\beta_1 = 20$ the reduction with preprocessing is not faster than the reduction without preprocessing, then there is no need to try larger $\beta_1$.
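The search order described above can be sketched as a loop over increasing $\beta_1$ with an early stop. This is a hedged illustration, not our experiment script: `time_without` and `time_with` are hypothetical timing callbacks, the toy cost model is invented for the example, and the sketch applies the early-stop rule at every $\beta_1$, which generalizes the example given in the text.

```python
def search_preprocessing(time_without, time_with, beta0, beta1_grid, p1_grid):
    """Return (best_time, best_params); best_params is None if no preprocessing wins."""
    baseline = time_without(beta0)
    best_time, best_params = baseline, None
    for beta1 in sorted(beta1_grid):
        if beta1 >= beta0:
            break                                   # only beta1 < beta0 makes sense
        t, p1 = min((time_with(beta0, beta1, p), p) for p in p1_grid)
        if t >= baseline:
            break                                   # early stop: skip larger beta1
        if t < best_time:
            best_time, best_params = t, (beta1, p1)
    return best_time, best_params

# Toy cost model (purely illustrative): preprocessing is cheapest around beta1 = 30.
grid_p = [0.05 * i for i in range(1, 21)]
toy_without = lambda b0: 10.0
toy_with = lambda b0, b1, p: 5.0 + abs(b1 - 30) / 10.0 + p
print(search_preprocessing(toy_without, toy_with, 50, range(20, 60, 5), grid_p))
```

With real timings the callbacks would run one round of reduction on an LLL-reduced basis and return the measured time.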

In addition, there is a slight recursive pattern in the parameters. For example, with $\beta = 25$ and $\beta = 30$, our experiments show that it is always better not to apply any preprocessing during reduction. This means that when we use a preprocessing with $\beta_1 \le 30$, we do not need any recursive preprocessing with $\beta_2$ inside the block of size $\beta_1$. To be on the safe side, when $\beta_1 > 30$, we did not rely completely on the recursive pattern. Even when $(\beta_2, p_2)$ is the recommended preprocessing parameter for BKZ-$\beta_1$, we always compare some parameters around $(\beta_2, p_2)$ and pick the best one, because the basis quality in BKZ blocks with preprocessing can be better than that of an LLL-reduced basis.

Our implementation is based on the NTL library, and it is known that its LLL reduction is not optimal. For example, it is slower than the implementation in the fplll library. In addition, the running time of the reduction has much to do with the computation precision. While we used double precision in our experiments, it is possible that for input lattices of larger dimension or special structure, the computation precision needs to be increased. For both these reasons, we do not claim our recommended parameters to be optimal under all circumstances, although they are probably close to optimal. The timing figures we provide are only used to select the reduction parameters, and should be interpreted with care.

In the tables that follow, we list our recommended parameters for $\beta_0 = 25, 30, \dots, 90$. For larger $\beta_0$, we do not recommend parameters for every $p_0$, because when $\beta_0 > 50$, large $p_0$ are inefficient and rarely used in practice. Instead, it makes more sense to use a larger blocksize $\beta_0$ and a smaller $p_0$ in order to obtain a similar output quality.

In each table, the row "Total time" shows the total time for the first round of reduction, while the row "Enum time" shows the time spent on enumeration.

From the figures, we see that the enumeration in dimension 25 and 30 is fast enough even without preprocessing.


Table 4.2: Executing for $\beta_0 = 25$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Total time (s)     0.16    0.17    0.17    0.17    0.18    0.18    0.18    0.19    0.20    0.19
Enum time (s)      0.03    0.04    0.03    0.03    0.03    0.03    0.04    0.03    0.03    0.03

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Total time (s)     0.20    0.19    0.19    0.20    0.20    0.20    0.20    0.21    0.22    0.26
Enum time (s)      0.03    0.04    0.03    0.04    0.04    0.04    0.04    0.04    0.05    0.08


Table 4.3: Executing for $\beta_0 = 30$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Total time (s)     0.21    0.21    0.22    0.22    0.23    0.24    0.24    0.24    0.25    0.25
Enum time (s)      0.04    0.04    0.02    0.04    0.04    0.04    0.04    0.04    0.04    0.04

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Total time (s)     0.26    0.26    0.26    0.28    0.28    0.29    0.30    0.30    0.32    0.66
Enum time (s)      0.04    0.05    0.05    0.06    0.06    0.06    0.07    0.08    0.08    0.42


For $\beta_0 = 35$, if we use pruned enumeration, preprocessing does not bring any advantage either, except for full enumeration, where preprocessing with the chosen parameters is better than no preprocessing. For full enumeration in BKZ-35, we recommend a preprocessing of BKZ-30 with $p = 5\%$ and $Q = 1$ round.


Table 4.4: Executing for $\beta_0 = 35$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Total time (s)     0.11    0.15    0.16    0.19    0.20    0.22    0.24    0.25    0.26    0.28
Enum time (s)      0.04    0.04    0.04    0.04    0.05    0.05    0.06    0.06    0.07    0.08

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Optimal $\beta_1$  -       -       -       -       -       -       -       -       -       30
Probability        -       -       -       -       -       -       -       -       -       5%
Rounds             -       -       -       -       -       -       -       -       -       1
Total time (s)     0.29    0.31    0.33    0.34    0.37    0.41    0.44    0.50    0.58    2.63
Enum time (s)      0.08    0.08    0.10    0.11    0.14    0.16    0.19    0.26    0.32    1.35


Table 4.5: Executing for $\beta_0 = 40$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Total time (s)     0.15    0.20    0.22    0.28    0.32    0.35    0.39    0.42    0.47    0.55
Enum time (s)      0.04    0.06    0.07    0.07    0.1     0.13    0.15    0.18    0.22    0.27

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Optimal $\beta_1$  -       -       -       -       -       -       -       -       25      35
Probability        -       -       -       -       -       -       -       -       30%     75%
Rounds             -       -       -       -       -       -       -       -       1       1
Total time (s)     0.58    0.63    0.80    0.83    0.98    1.17    1.56    1.93    2.21    11.22
Enum time (s)      0.30    0.33    0.51    0.54    0.68    0.86    1.25    1.61    0.61    9.05


Table 4.6: Executing for $\beta_0 = 45$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Optimal $\beta_1$  -       -       -       -       -       -       -       -       -       25
Probability        -       -       -       -       -       -       -       -       -       30%
Rounds             -       -       -       -       -       -       -       -       -       1
Total time (s)     0.23    0.38    0.49    0.60    0.91    1.06    1.17    1.49    1.65    2.12
Enum time (s)      0.10    0.19    0.29    0.37    0.65    0.77    0.87    1.18    1.33    0.43

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Optimal $\beta_1$  25      25      25      25      25      25      25      40      40      40
Probability        10%     20%     5%      20%     45%     65%     60%     10%     20%     70%
Rounds             1       1       1       1       1       1       1       1       1       1
Total time (s)     2.21    2.32    2.47    2.62    2.84    3.07    3.39    3.91    5.11    77.55
Enum time (s)      0.52    0.55    0.77    0.88    0.95    1.09    1.42    1.75    2.77    74.72


Starting from BKZ-50, preprocessing clearly improves the performance of BKZ.

Table 4.7: Executing for $\beta_0 = 50$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Optimal $\beta_1$  -       25      25      25      25      25      25      25      25      35
Probability        -       5%      15%     15%     30%     30%     55%     60%     55%     30%
Rounds             -       1       1       1       1       1       1       1       1       1
Total time (s)     0.95    1.93    2.15    2.30    2.58    2.86    3.11    3.34    3.73    4.10
Enum time (s)      0.77    0.30    0.39    0.53    0.71    0.92    1.03    1.21    1.59    1.53

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Optimal $\beta_1$  45      45      45      45      45      45      40      45      40      -
Probability        5%      10%     15%     10%     15%     15%     30%     20%     50%     -
Rounds             1       1       1       1       1       1       1       1       1       -
Total time (s)     4.49    4.92    5.37    5.57    6.44    7.38    8.91    10.46   14.10   -
Enum time (s)      2.21    1.94    2.49    2.77    3.57    4.44    5.76    7.41    10.70   -


Table 4.8: Executing for $\beta_0 = 55$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Optimal $\beta_1$  25      25      25      25      35      40      45      45      45      45
Probability        10%     55%     55%     65%     20%     15%     10%     10%     15%     20%
Rounds             1       1       1       1       1       1       1       1       1       1
Total time (s)     0.37    0.83    1.13    1.70    2.00    2.37    3.29    3.54    4.65    5.31
Enum time (s)      2.09    2.90    3.26    4.00    4.82    5.55    6.63    6.98    8.21    8.97

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Optimal $\beta_1$  45      45      45      45      45      45      40      45      40      -
Probability        20%     20%     25%     30%     35%     45%     45%     45%     45%     -
Rounds             1       1       1       1       1       1       1       1       1       -
Total time (s)     7.26    7.85    10.45   13.60   15.94   21.76   25.91   42.51   61.61   -
Enum time (s)      10.91   11.58   14.33   17.57   20.02   25.95   30.16   46.77   69.02   -
Table 4.9: Executing for $\beta_0 = 60$

$p_0$              5%      10%     15%     20%     25%     30%     35%     40%     45%     50%
Optimal $\beta_1$  35      35      40      45      45      45      45      45      45      45
Probability        15%     25%     25%     15%     15%     20%     25%     20%     40%(*)  35%
Rounds             1       1       1       1       1       1       1       1       1       1
Total time (s)     1.36    2.65    4.05    5.35    7.99    10.02   11.84   17.39   13.32   16.88
Enum time (s)      4.07    5.79    7.93    9.49    12.06   14.36   16.39   21.78   16.52   20.04

$p_0$              55%     60%     65%     70%     75%     80%     85%     90%     95%     100%
Optimal $\beta_1$  45      45      50      50      50      50      50      50      -       -
Probability        35%     35%     25%     30%     30%     25%     40%     50%     -       -
Rounds             1       1       1       1       1       1       1       1       -       -
Total time (s)     22.11   28.32   37.19   41.71   62.07   68.76   88.18   178.88  -       -
Enum time (s)      25.24   31.50   40.33   44.98   65.30   71.95   91.53   182.37  -       -



In the table, from 40%(*) on, we switched the program to a machine which is 1.6 times faster. 


For larger $\beta_0$, some of the recommended parameters have recursion level $L > 1$, as was the case for
$\beta_0 = 65$, $p_0 \ge 50\%$. In this case, we represent the parameters by simply listing
$(\beta_2, p_2, Q_2, \beta_1, p_1, Q_1)$.


Table 4.10: Executing for $\beta_0 = 65$

$\beta_0 = 65$      5%      10%     15%     20%     25%     30%     35%     40%     45%
Optimal $\beta_1$   45      45      50      50      50      50      50      50      50
Probability         15%     15%     10%     15%     20%     15%     20%     25%     25%
Rounds              1       1       1       1       1       1       1       1       1
Total time (s)      3.24    6.96    12.91   18.56   28.95   37.31   44.02   60.50   76.75
Enum time (s)       6.07    9.96    16.12   21.98   32.48   40.73   47.60   64.22   80.44


$\beta_0 = 65$    Optimal preprocessing parameter    Total time (s)
50%               25 0.1 1 50 0.3 1                  97.47
55%               25 0.05 1 55 0.2 1                 116.85
60%               25 0.05 1 55 0.25 1                136.13
65%               25 0.05 1 55 0.25 1                180.93
70%               25 0.05 1 55 0.3 1                 231.70
75%               25 0.05 1 55 0.3 1                 243.75
80%               25 0.05 1 55 0.3 1                 387.41

Table 4.11: Executing for $\beta_0 = 70$


$\beta_0 = 70$    Optimal preprocessing parameter    Total time (s)
5%                50 0.05 1                          85.35
10%               25 0.05 1 55 0.05 1                140.75
15%               25 0.05 1 55 0.05 1                191.15
20%               25 0.05 1 55 0.15 1                265.00


Table 4.12: Executing for $\beta_0 = 75$

$\beta_0 = 75$    Optimal preprocessing parameter    Total time (s)
5%                35 0.15 1 60 0.05 1                204.85
10%               25 0.15 1 55 0.15 1                464.9
15%               25 0.05 1 55 0.15 1                808.35
20%               25 0.05 1 55 0.15 1                1310.25


Table 4.13: Executing for $\beta_0 = 80$

$\beta_0 = 80$    Optimal preprocessing parameter    Total time (s)
5%                40 0.15 2 60 0.05 1                879.35
10%               35 0.15 1 60 0.05 1                3715.15


Table 4.14: Executing for $\beta_0 = 90$

$\beta_0 = 90$    Optimal preprocessing parameter    Total time (s)
0.5%              50 0.15 2 65 0.2 2                 2443.35


For larger blocksizes, the enumeration can be predicted with the Gaussian heuristic, and its performance
can be predicted by simulation. Please refer to the next chapter for a discussion of optimal parameters
for enumeration.
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Chapter 5 


New Reduction Procedure
The simulation of BKZ reduction allows us to predict the running time and output quality of each
round. Hence, given an input basis, we are able to compare possible values of $\beta$ and other parameters,
and decide on the best parameters for reduction. This is again an optimization problem. As a first step,
we only look at different values of $\beta$.

The output basis after this round of reduction has better quality than the input basis. Therefore,
for the new round, we will arrive at different reduction parameters, and the optimal $\beta$ of
the next round is probably different from that of the previous round. This gives a reduction algorithm
different from the previous BKZ reduction, where we fix $\beta$ in advance. Instead, $\beta$ can be automatically
chosen at each round, depending on the current basis. It is adjusted (increased) each round until the
basis is reduced to the target quality. This process can be fully predicted using the simulation procedure,
and the resulting basis sequence is what we call an optimal reduction sequence. It is called optimal
because at each round, the chosen $\beta$ is the one that the simulation predicts to be the most efficient.

The optimal reduction sequence also arguably answers the question of optimal enumeration. In
chapter 3, we discussed how to choose a pruning function for a given basis, but we did not know how
to choose a good reduction in order to minimize running time. Now the optimal reduction sequence
provides an answer for this. It suffices to compute, for each basis in this sequence, the best possible
enumeration time, and to add this to the reduction time. In most cases, this gives an almost
optimal answer.

Following the same vein, it is natural to ask whether it is possible to automatically choose a good
preprocessing strategy as well. It turns out that this is more complicated. The main idea is similar
to that of optimal enumeration, since the goal of preprocessing is to prepare a block of the basis for
efficient enumeration as well. However, the goal of this enumeration procedure in BKZ is implicit. In BKZ
enumeration, we are actually solving ASVP$_\gamma$ with some probability $p_0$, where $\gamma$ and $p$ are not clear.
This is contrary to the common case of standalone SVP, where we are given an approximation factor, and we
aim to solve it with 100% confidence with $1/p$ randomized enumerations. Here we are not
hoping to be always successful for BKZ enumerations, and we do not have an explicit demand for the
approximation factor $\gamma$. That is to say, our optimization goal is unclear. While this problem is still open,
we provide some sub-optimal ways to set up the preprocessing parameters automatically.

In practice, the new BKZ reduction algorithm allows us to solve the SVP challenge up to dimension
126.

5.1 Optimal reduction sequence 

In BKZ reduction, we are able to predict the output basis quality for each round. This gives us the choice
of using different reduction parameters for different rounds, depending on the character of the current
basis. In fact, we are able to compare the efficiency of reduction for different parameters, and thus choose
the optimal reduction parameters for the given basis. Once we optimize the reduction parameters for
each round iteratively, we obtain the optimal sequence of bases by using different parameters for each
round.


Algorithm 28 Block Korkin-Zolotarev (BKZ) algorithm

Input: A basis $B = (b_1, \dots, b_n)$, a target Hermite factor $\delta_0$ or target half volume $\eta_0$.
Output: The basis $(b_1, \dots, b_n)$ is BKZ-$\beta$ reduced

$(B, U) \leftarrow$ LLL$(b_1, \dots, b_n, U)$; // LLL-reduce the basis, and obtain Gram-Schmidt Orthogonalisation $U$
while $\delta(B) < \delta_0$ or $\eta(B) < \eta_0$ do
    $(\beta, r, T) \leftarrow$ BestReductionParameter$(B)$
    $(B, U) \leftarrow$ OneRoundBKZ$_{\beta, r, T}(B, U)$
end while


The BestReductionParameter procedure gives the optimal parameters $(\beta, r, T)$ which most
efficiently reduce the given basis $B$. This is done by solving an optimization problem again. The
OneRoundBKZ procedure is described as follows:

The BKZ-$\beta$ algorithm makes calls to enumerations of dimension at most $\beta$.

5.1.1 The optimization problem 

We turn the problem of choosing parameters into an optimization problem thanks to simulation. By
trying different parameters, and comparing their running time and output basis quality, we can choose
the best among them for the next round of reduction. This, again, can be viewed as an optimization
problem, where the variables to be optimized are the reduction parameters, and the optimization goal is
the quality of the basis with regard to running time.

The parameters to be optimized include $\beta$, $\gamma$, and $p$: $\beta$ is the size of the enumeration block, while
$\gamma$ and $p$ are parameters for the enumeration on each block. The enumeration is supposed to
find a short vector of each block within radius $\gamma \cdot \mathrm{GH}$ with total probability $p$.

We still need to define an optimization goal which integrates both basis quality and running
time. The quality of the basis can be described by several values, including the half volume $\eta(B)$ and
the root Hermite factor $\delta(B)$. The running time $T$ can be roughly estimated by adding the enumeration
time of each block. This gives us several choices of goal functions to minimize. We denote by $B$ and $B'$
the basis before and after reduction respectively; then possible choices of goal functions include

• $f(\beta, \gamma, p) = (\eta(B') - \eta(B))/T$

• $f(\beta, \gamma, p) = (\delta(B') - \delta(B))/T$

• $f(\beta, \gamma, p) = (\eta(B') - \eta(B))/\log T$

• $f(\beta, \gamma, p) = (\delta(B') - \delta(B))/\log T$

The differences between these functions are subtle: the Hermite factor $\delta(B)$ describes the length of
$\|b_1\|$ with regard to the whole basis, whereas the half volume $\eta(B)$ dominates the cost of enumeration
of the basis. If the purpose of reduction is to prepare the basis for enumeration, then it would be logical to
use $\eta(B)$; otherwise, if one only wants a shorter basis, especially if one needs a short vector, the natural
choice is $\delta(B)$.

Setting $T$ as the denominator is intuitive, as we want to improve the quality as much as possible per unit
of time. However, this choice may favor very small $\beta$ with minor improvements, which increases
the number of rounds needed to reach the target quality. One can use $\log T$ to encourage larger steps of
improvement.
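The effect of the denominator can be illustrated with a small sketch. The numbers below are made up for illustration, the helper names `goal` and `best_choice` are hypothetical, and the sketch assumes a quality measure that decreases as the basis improves, so that minimizing the goal function picks the most negative value.

```python
import math

def goal(quality_before, quality_after, time, use_log=False):
    """Quality change per unit of time (or of log-time) for one simulated round."""
    denominator = math.log(time) if use_log else time
    return (quality_after - quality_before) / denominator

def best_choice(quality_before, candidates, use_log=False):
    """Return the name of the candidate with the smallest goal value."""
    scores = {name: goal(quality_before, q_after, t, use_log)
              for name, (q_after, t) in candidates.items()}
    return min(scores, key=scores.get)

# Made-up numbers for illustration: (quality after the round, time in seconds).
candidates = {
    "small beta": (9.9, 2.0),      # minor improvement, very cheap
    "large beta": (8.0, 1000.0),   # large improvement, expensive
}
print(best_choice(10.0, candidates))                # divides by T
print(best_choice(10.0, candidates, use_log=True))  # divides by log T
```

With these toy values, dividing by $T$ selects the small block size and dividing by $\log T$ selects the large one, which matches the behaviour described above.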

The optimization problem can be formalized as:

$$\min_{\beta, \gamma, p} f(\beta, \gamma, p) \quad \text{subject to } \gamma > 1,\ 2 \le \beta \le n,\ 0 < p < 1 \tag{5.1}$$

5.1.2 Estimating reduction running time 

In high dimension, enumeration time is usually dominant compared to the time spent on computing
GS and LLL reduction. Thus our estimation of the running time is

$$T_{\mathrm{BKZ}}(B, \beta, \gamma, p) = \sum_{\substack{j = 1, \dots, n-1 \\ k = \min(j + \beta,\, n)}} T_{\mathrm{enum}}(B_{[j,k]}, \gamma, p) + O(1). \tag{5.2}$$

We have discussed the estimation of the running time of enumeration in chapter 3, when we search for
a vector within $\gamma \cdot \mathrm{GH}$ on basis $B$. Here there is one more variable $p$, because we no longer expect to
successfully find a short vector all the time. The estimated running time is simply

$$T_{\mathrm{enum}}(B_{[j,k]}, \gamma, p) = p \cdot T_{\mathrm{enum}}(B_{[j,k]}, \gamma) \tag{5.3}$$
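The two estimates above can be combined in a few lines. This is a minimal sketch, not the thesis implementation: `estimate_round_time` and `toy_t_enum` are hypothetical names, the cost model is a placeholder rather than measured data, and it depends only on the block dimension and $\gamma$ (a real look-up would also depend on the block's half volume).

```python
def estimate_round_time(n, beta, gamma, p, t_enum):
    """Sum p * t_enum(block dimension, gamma) over the blocks [j, k] of one round."""
    total = 0.0
    for j in range(1, n):                  # j = 1, ..., n-1
        k = min(j + beta, n)               # block end, indexed as in equation (5.2)
        total += p * t_enum(k - j + 1, gamma)
    return total

def toy_t_enum(dim, gamma):
    # Placeholder cost model, NOT measured data: grows with dimension,
    # shrinks when the search radius factor gamma is relaxed.
    return 2.0 ** (dim / 4.0) / gamma

print(estimate_round_time(n=100, beta=40, gamma=1.05, p=0.5, t_enum=toy_t_enum))
```

Passing the cost model as a callable keeps the summation independent of how enumeration times are obtained, whether from a table or from a formula.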

In fact, the estimation of enumeration is not definitive yet, because it will depend on the analysis of
reduction. We will discuss how this optimal reduction sequence helps to improve the enumeration
performance in section 5.2. If we tried to optimize everything at the same time, it would be tremendously
complicated. Instead, as a first step, we use the enumeration.

Ultimately we wish to create a look-up table for optimal enumeration time. For a typically reduced
basis $B$, its dimension $n$ and half volume $\eta(B)$ are enough to describe its relative character for
estimating enumeration time, and searching for a short vector within approximation factor $\gamma$ will take
$T_{\mathrm{enum}}(n, \eta(B), \gamma)$ time. The base cases are established from data recorded in experiments, such as
the data from chapter 4. The construction of this table is discussed in subsection 5.2.1. Once we have
this look-up table, we can do the following estimation:

$$T_{\mathrm{enum}}(B_{[j,k]}, \gamma) = T_{\mathrm{enum}}(k - j + 1, \eta(B_{[j,k]}), \gamma) \tag{5.4}$$

5.1.3 Optimize $\beta$

For simplicity, we first consider a single parameter $\beta$, supposing $\gamma \approx 1$ and $p \approx 1$. The algorithm
for searching for the optimal $\beta$ is described in Algorithm 29.


Algorithm 29 SearchOptimal$\beta$

Input: The log Gram-Schmidt norm vectors of the basis to be reduced $B^* = (b^*_1, \dots, b^*_n)$
       and the expected running time of the round $T_0$.
Output: The suggested optimal reduction blocksize $\beta$

$\beta' \leftarrow \beta_0$; // $\beta_0$ is an initial value, which can be set to 20 for example.
$B'^* \leftarrow$ SimulateOneRoundBKZ$(B^*, \beta')$
$s \leftarrow (\eta(B'^*) - \eta(B^*)) / T_{\mathrm{BKZ}}(B^*, \beta')$
while $\beta' < n$ do
    $\beta' \leftarrow \beta' + 1$;
    $B'^* \leftarrow$ SimulateOneRoundBKZ$(B^*, \beta')$
    $s' \leftarrow (\eta(B'^*) - \eta(B^*)) / T_{\mathrm{BKZ}}(B^*, \beta')$
    if $s' < s$ then
        $s \leftarrow s'$
    else
        $\beta \leftarrow \beta'$; return;
    end if
end while
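The search loop can be sketched in Python as follows. This is an illustration under stated assumptions: the simulator, the quality measure and the cost model are passed in as callables, the `toy_*` functions are invented stand-ins (not a BKZ simulator), and this sketch returns the last block size that still improved the score, which is one reading of the stopping rule.

```python
def search_optimal_beta(profile, simulate, quality, round_time, beta0=20):
    """Increase beta from beta0 while the score keeps decreasing; return the last improving beta."""
    def score(beta):
        improved = simulate(profile, beta)
        return (quality(improved) - quality(profile)) / round_time(profile, beta)

    n = len(profile)
    beta, best = beta0, score(beta0)
    while beta < n:
        candidate = score(beta + 1)
        if candidate >= best:        # no further improvement: stop
            break
        beta, best = beta + 1, candidate
    return beta

# Toy stand-ins, purely illustrative (not a real BKZ simulator or cost model).
def toy_simulate(profile, beta):
    return [x - 0.01 * beta for x in profile]

def toy_quality(profile):
    return sum(profile)

def toy_round_time(profile, beta):
    return 2.0 ** (beta / 5.0)

profile = [1.0] * 40
print(search_optimal_beta(profile, toy_simulate, toy_quality, toy_round_time, beta0=3))
```

Because the loop stops at the first non-improving step, it finds a local optimum only; this is adequate when the score is unimodal in $\beta$, as in the toy model.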


5.1.4 Adding other parameters 

With multiple parameters, it is hard to see how to optimize them other than by exhaustive
search in the whole parameter space. The exhaustive search algorithm for multiple parameters is similar
to the single parameter version, except that there is one more loop searching for an optimal $\gamma$. We do
not know how to add $p$ to the search yet, due to the restrictions of our simulation algorithm. But in
theory, once we have a good simulation which accepts $p$, adding $p$ to the optimization parameters is just
adding yet another loop within the loop for $\gamma$.

However, more assumptions on the reduction will simplify the problem. For example, if we restrict
the time of reduction for each round to be $T_0$, then on average, we will have about $T_0/n$ time for each
block enumeration. Suppose the block size is $\beta$; then the constraint $T_{\mathrm{BKZ}}(B^*, \beta, \gamma) = T_0/n$ implies a
unique value of $\gamma$. This value can be computed using binary search, since $T_{\mathrm{BKZ}}$ is monotonically
increasing with respect to $\gamma$. This yields Algorithm 30.
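The binary search on $\gamma$ relies only on monotonicity. A minimal sketch, assuming a cost function that is increasing on a bracketing interval; `solve_gamma`, the interval `[lo, hi]` and the tolerance are hypothetical choices made for this example.

```python
def solve_gamma(cost, target, lo=1.0, hi=2.0, tol=1e-9):
    """Find gamma in [lo, hi] with cost(gamma) == target, for an increasing cost function."""
    if not cost(lo) <= target <= cost(hi):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cost(mid) < target:
            lo = mid      # cost too small: gamma must be larger
        else:
            hi = mid      # cost large enough: gamma is at most mid
    return (lo + hi) / 2.0

# Toy increasing cost, for illustration only.
print(solve_gamma(lambda g: g ** 3, 3.375))
```

The bracket check makes a missing solution an explicit error instead of silently returning an endpoint.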


Algorithm 30 SearchOptimal$\beta$And$\gamma$

Input: The log Gram-Schmidt norm vectors of the basis to be reduced $B^* = (b^*_1, \dots, b^*_n)$.
       The expected time for a reduction round $T_0$.
Output: The suggested optimal reduction blocksize $\beta$ and factor $\gamma$.

$\beta' \leftarrow \beta_0$; // $\beta_0$ is an initial value, which can be set to 20 for example.
$\gamma \leftarrow$ Solve$(T_{\mathrm{BKZ}}(B^*, \beta', \gamma) = T_0)$ // Binary search on $\gamma$
$B'^* \leftarrow$ SimulateOneRoundBKZ$(B^*, \beta', \gamma)$
$s \leftarrow (\eta(B'^*) - \eta(B^*)) / T_{\mathrm{BKZ}}(B^*, \beta')$
while $\beta' < n$ do
    $\beta' \leftarrow \beta' + 1$;
    $\gamma \leftarrow$ Solve$(T_{\mathrm{BKZ}}(B^*, \beta', \gamma) = T_0)$ // Binary search on $\gamma$
    $B'^* \leftarrow$ SimulateOneRoundBKZ$(B^*, \beta', \gamma)$
    $s' \leftarrow (\eta(B'^*) - \eta(B^*)) / T_{\mathrm{BKZ}}(B^*, \beta')$
    if $s' < s$ then
        $s \leftarrow s'$
    else
        $\beta \leftarrow \beta'$; return;
    end if
end while


5.1.5 Optimal sequence

For each round we are able to compute a set of optimal parameters. Doing this optimal reduction
iteratively yields an optimal sequence of bases, which can be viewed as the fastest possible path along
which a basis can be reduced. We define the optimal sequence by the following algorithm.


Algorithm 31 OptimalSequence

Input: The log Gram-Schmidt norm vectors of the basis to be reduced: $B^* = (\|b^*_1\|, \dots, \|b^*_n\|)$.
Output: A sequence of log Gram-Schmidt norm vectors $B^*_1, B^*_2, \dots$ produced by optimal reduction,
        and the accumulated reduction time $T_1, T_2, \dots$

$B^*_1 \leftarrow B^*$; $T_1 \leftarrow 0$
for $i = 2, 3, \dots$ do
    $(\beta, [\gamma]) \leftarrow$ SearchOptimal$\beta$[And$\gamma$]$(B^*_{i-1})$;
    $B^*_i \leftarrow$ SimulateOneRoundBKZ$(B^*_{i-1}, \beta[, \gamma])$;
    $T_i \leftarrow T_{\mathrm{BKZ}}(B^*_{i-1}, \beta[, \gamma]) + T_{i-1}$
end for
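The iteration of Algorithm 31 maps naturally onto a generator. The following is a sketch under assumptions: `search`, `simulate` and `round_time` are caller-supplied callables (hypothetical interfaces standing in for the parameter search, the BKZ simulator and the cost estimate), and the toy lambdas exist only to make the example runnable.

```python
from itertools import islice

def optimal_sequence(profile, search, simulate, round_time):
    """Yield (profile_i, accumulated_time_i), starting from the input basis at time 0."""
    elapsed = 0.0
    yield profile, elapsed
    while True:
        beta = search(profile)                  # best block size for the current basis
        elapsed += round_time(profile, beta)    # cost of reducing the current basis
        profile = simulate(profile, beta)       # predicted basis after the round
        yield profile, elapsed

# Toy stand-ins for illustration only.
steps = list(islice(
    optimal_sequence([1.0],
                     search=lambda prof: 20,
                     simulate=lambda prof, beta: [0.9 * x for x in prof],
                     round_time=lambda prof, beta: 1.5),
    3))
times = [t for _, t in steps]
print(times)
```

A generator leaves the stopping condition (target quality, time budget, or a fixed number of rounds) to the caller, which is how the sequence is consumed when it is combined with enumeration.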


5.2 Optimal enumeration 


5.2.1 Putting together reduction and enumeration 


In chapter 3, we discussed how to choose a pruning function for a given basis, assuming that the basis
is reduced within a constant time. However, we did not discuss how to choose a good reduction. A
natural solution is to select a reduced basis from the optimal reduction sequence. The algorithm to
optimize enumeration by choosing a proper reduction is described in Algorithm 32.

Algorithm 32 OptimalEnum

Input: The log Gram-Schmidt norm vectors of the basis to be reduced: $\ell = \{\log(\|b^*_1\|), \dots, \log(\|b^*_n\|)\}$,
       the approximation factor $\gamma$.
Output: Best enumeration parameter $B_1, \dots, B_n$

$[(B^*_1, B^*_2, \dots), (T_1, T_2, \dots)] \leftarrow$ OptimalSequence$(B^*)$;
$t \leftarrow +\infty$;
for $i = 2, 3, \dots$ do
    $t' \leftarrow \min_{R \in \mathbb{R}^n} \dfrac{T_i + T_{\mathrm{enum}}(B^*_i, R)}{P_{\mathrm{succ}}(R)}$
    if $t' < t$ then
        $t \leftarrow t'$
    else
        return $(B^*_i, T_i)$
    end if
end for


We searched for the optimal enumeration strategy for dimension < 250, assuming LLL-reduced
quality for all lattices in the beginning. Due to the limits of our simulation procedure, we restrict our
computation to BKZ-$\beta$ with $\beta \ge 50$.

In the following table, we present the optimal reduction for dimension $70 \le \beta \le 250$. The last
column shows the number of operations in $\log_2$. Today, a typical CPU can perform about $10^7$
operations per second.


Table 5.1: Cost of enumeration 


$\beta$    $p$    Optimal preprocessing parameter                                        Operations in $\log_2$
70         5%     $\beta_1 = 50, p_1 = 0.05, Q_1 = 1$                                    28.86
75         5%     $\beta_2 = 35, p_2 = 0.15, Q_2 = 1$; $\beta_1 = 60, p_1 = 0.05, Q_1 = 1$   30.35
80         5%     $\beta_2 = 40, p_2 = 0.15, Q_2 = 2$; $\beta_1 = 60, p_1 = 0.05, Q_1 = 1$   32.71
90         2%     $\beta_2 = 50, p_2 = 0.15, Q_2 = 2$; $\beta_1 = 65, p_1 = 0.2, Q_1 = 2$    35.17


For dimension $\ge 100$, we simplified our search by only considering BKZ-$\beta$ and $p = 1$. So this can
only be considered as a conservative estimate. A sequence of $\beta$ is given, where each $\beta$ is followed
by a number in parentheses, which indicates the number of rounds of BKZ-$\beta$ reduction to perform. For
example, given an LLL-reduced random basis of dimension 100, it is recommended to do 5 rounds of
BKZ-50, followed by two rounds of BKZ-60, before performing an extreme pruning with 0.13% probability
for each enumeration process.

Table 5.2: Cost of enumeration


n     p             Optimal preprocessing parameter                                     Operations in $\log_2$
100   0.13%         50(5) 60(2)                                                         39.21
110   6.8 x 10^-6   50(5) 60(3) 70(2) 80(1)                                             44.23
120   6.3 x 10^-7   50(6) 60(4) 70(3) 80(2) 90(1)                                       49.00
130   2.6 x 10^-9   50(7) 60(4) 70(3) 80(2) 90(2) 100(1)                                54.21
140   1.2 x 10^-5   50(8) 60(5) 70(3) 80(2) 90(2) 100(1)                                59.91
150   6 x 10^-6     50(9) 60(5) 70(4) 80(2) 90(3) 100(2) 110(1)                         65.77
160   2.5 x 10^-6   50(10) 60(6) 70(4) 80(3) 90(3) 100(3) 110(2) 120(1)                 71.74
170   1.3 x 10^-6   50(11) 60(7) 70(4) 80(3) 90(3) 100(3) 110(2) 120(2) 130(1)          77.89
180   7.8 x 10^-7   50(12) 60(7) 70(5) 80(3) 90(3) 100(3) 110(3) 120(2) 130(2) 140(1)   84.31
190   3.4 x 10^-10  50(14) 60(8) 70(5) 80(3) 90(4) 100(4) 110(3) 120(3) 130(3) 140(1)   95.76
200   1.5 x 10^-7   50(15) 60(9) 70(6) 80(3) 90(4) 100(4) 110(3) 120(3) 130(3) 140(2)   99.32
                    150(2) 160(2)
210   1.7 x 10^-7   50(16) 60(10) 70(6) 80(4) 90(4) 100(4) 110(4) 120(3) 130(3) 140(3)  104.45
                    150(2) 160(2) 170(1)
220   1.2 x 10^-7   50(17) 60(11) 70(7) 80(4) 90(5) 100(5) 110(4) 120(3) 130(3) 140(3)  111.47
                    150(3) 160(2) 170(2) 180(1)
230   7.4 x 10^-12  50(19) 60(11) 70(7) 80(4) 90(5) 100(5) 110(4) 120(4) 130(3) 140(3)  119.90
                    150(3) 160(3) 170(1)
240   1.8 x 10^-7   50(20) 60(12) 70(8) 80(4) 90(6) 100(5) 110(4) 120(4) 130(4) 140(3)  126.92
                    150(3) 160(3) 170(3) 180(3) 190(1) 200(2)
250   2.5 x 10^-8   50(22) 60(13) 70(8) 80(5) 90(6) 100(6) 110(5) 120(4) 130(4) 140(4)  133.55
                    150(3) 160(3) 170(3) 180(4) 190(2) 200(2) 210(1)
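The notation of Table 5.2 can be read mechanically. The sketch below parses a preprocessing schedule such as `50(5) 60(2)` into (block size, rounds) pairs and converts a $\log_2$ operation count into wall-clock time, using the rate of about $10^7$ operations per second quoted above; the helper names are hypothetical and the conversion is only as accurate as that rate.

```python
import re

def parse_schedule(text):
    """Turn '50(5) 60(2)' into [(50, 5), (60, 2)], i.e. (block size, rounds) pairs."""
    return [(int(beta), int(rounds))
            for beta, rounds in re.findall(r"(\d+)\((\d+)\)", text)]

def seconds_from_log2_ops(log2_ops, ops_per_second=1e7):
    """Convert a log2 operation count into seconds at the given rate."""
    return 2.0 ** log2_ops / ops_per_second

schedule = parse_schedule("50(5) 60(2)")           # dimension 100 row
hours = seconds_from_log2_ops(39.21) / 3600.0      # same row, last column
print(schedule, round(hours, 1))
```

For the dimension 100 row this gives a little under 18 hours on a single core at the assumed rate.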


The estimation of enumeration time depends on the optimal reduction sequence, while the estimation
of reduction complexity depends on estimates of enumeration time. These two problems should
be considered together.

In fact, the enumeration of dimension $n$ will only make calls to reduction of block size $< n$, and
similarly the reduction of block size $\beta$ will only make calls to enumeration with dimension $\le \beta$.
Therefore, we can regard this as a dynamic programming problem, where the enumeration
cost table can be gradually filled in, from low dimension to higher dimension.


5.3 Experiments and challenges 

5.3.1 Darmstadt SVP challenge 

The Darmstadt SVP challenge provides a platform to test our algorithm of optimal enumeration. It is
enough to find a vector shorter than $1.05 \cdot \mathrm{GH}(B)$ for each dimension. We apply reductions to the
lattices with an increasing sequence of $\beta$. Finally an enumeration is applied on this basis.

For example, for dimension 124, we consecutively applied BKZ-60, BKZ-80, BKZ-102 and BKZ-118
to its basis, before finally performing a pruned enumeration on the reduced basis.

The challenge provides multiple lattice bases for the same dimension. Therefore, instead of randomizing
the basis and redoing the reduction and enumeration, as the algorithm assumes, we apply
the same procedure to different bases. The success probability for a single enumeration is about 0.02 in
dimension 124, hence we launched reduction for about 54 instances simultaneously.

The following list shows a few examples of SVP challenges we solved:

Table 5.3: New solutions for Darmstadt's SVP challenges [54] 


Dim(Lattice)    Solution Norm    Previous norm    $\|b_1\|/\mathrm{GH}(B)$
126             2969             Unsolved         1.0436
124             2884             2936             1.0217
124             2936             Unsolved         1.0427
122             2913             Unsolved         1.0444


5.3.2 Darmstadt lattice challenge 


Table 5.4: New Solutions for Darmstadt's lattice challenge [54] 


Dim(lattice)    Dim(sublattice)    New norm    Previous norm    Ratio    Hermite factor
800             230                120.054     Unsolved                  1.00978^230
775             230                112.539     Unsolved                  1.00994^230
750             220                95.995      Unsolved                  1.0976^220
725             210                85.726      100.90           0.85     1.00978^210
700             200                78.537      86.02            0.91     1.00993^200
675             190                72.243      74.78            0.97     1.00997^190
650             190                61.935      66.72            0.93     1.00993^190
625             180                53.953      59.41            0.91     1.00987^180
600             180                45.420      52.01            0.87     1.00976^180
575             180                39.153      42.71            0.92     1.00977^180
550             180                32.481      38.29            0.85     1.00955^180
525             180                29.866      30.74            0.97     1.00990^180


Darmstadt's lattice challenge is based on Ajtai's construction of hard lattice instances. For each 
dimension, the challenge is to find a vector of norm < q in an Ajtai lattice [4], where q depends on the 
dimension; and try to minimize the norm. Before year 2010, the highest challenge solved was 725: the 
first solutions to all challenges in dimension 575 to 725 were found by Gama and Nguyen in 2008, using 
NTL's implementation of BKZ with SH pruning. All solutions were found by reducing appropriate 
sublattices of much smaller dimension (typically around 150-200), whose existence follows from the 
structure of Ajtai lattices: we followed the same strategy. 

We used a relaxed enumeration on reduced bases to find the first ever solutions up to dimension
800, and significantly shorter vectors in all challenges 525 to 725, as summarized in Table 5.4: the first
column is the dimension of the challenge, the second one is the dimension of the sublattice we used to
find the solution, the third one is the best norm found by BKZ 2.0, the fourth one is the previous best
norm found by former algorithms, the fifth one is the ratio between norms, and the sixth one is the
Hermite factor of the reduced basis of the sublattice, which turns out to be slightly below $1.01^{\dim}$. The
factor $1.01^{\dim}$ was considered to be the state-of-the-art limit in 2008 by Gama and Nguyen [28], which
shows the improvement.

Table 5.5: Number of BKZ rounds required to break NTRUSign-157, as predicted by Alg. 20, starting
from a BKZ-20 reduced basis

Blocksize $\beta$     106     110     115     120
Number of rounds      13      6       5       4

5.4 Revising security estimate of NTRU lattices 

In the NTRU cryptosystem [47], recovering the secret key from the public key amounts to finding a 
shortest vector in high-dimensional lattices of special structure. Because NTRU security estimates are 
based on benchmarks with BKZ, it is interesting to see the limits of this methodology. 

In the original article [47], the smallest parameter set NTRU-107 corresponds to lattices of dimension
214, and it was estimated that key recovery would cost at least $2^{50}$ elementary operations. The best
experimental result to recover the secret key for NTRU-107 by direct lattice reduction (without ad-hoc
techniques like [28,58,60] which exploit the special structure of NTRU lattices) is due to May in 1999 [58],
who reported one successful experiment using BKZ with SH pruning [88], after 663 hours on a 200-MHz
processor, that is $2^{48.76}$ clock cycles. We performed experiments with BKZ 2.0 on 10 random NTRU-107
lattices: we applied LLL and BKZ-20, which takes a few minutes at most; we then applied BKZ-65 with
5%-pruning, and checked every 5 minutes whether the first basis vector was the shortest vector corresponding
to the secret key, in which case we aborted. BKZ 2.0 was successful for each lattice, and the aborted
BKZ-65 reduction took less than 2000s on average, on a 2.83 GHz single core. So the overall running
time is less than 40 minutes, that is $2^{42.62}$ clock cycles, which gives a speedup of at least 70 compared
to May's experiment, and is significantly lower than $2^{50}$ elementary operations. Hence, there is an
order of magnitude between the initial security estimate of $2^{50}$ and the actual security level, which is
approximately at most 40-bit.
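The cycle counts in this comparison follow from simple arithmetic, sketched below. The clock rate of the newer machine is taken as 2.83 GHz, which is an assumption here: it is the reading consistent with the stated count of about $2^{42.62}$ cycles for 40 minutes.

```python
import math

def log2_cycles(seconds, hertz):
    """Base-2 logarithm of the number of clock cycles spent."""
    return math.log2(seconds * hertz)

may_1999 = log2_cycles(663 * 3600, 200e6)   # 663 hours at 200 MHz
bkz_2_0 = log2_cycles(40 * 60, 2.83e9)      # 40 minutes at 2.83 GHz (assumed reading)
speedup = 2.0 ** (may_1999 - bkz_2_0)
print(round(may_1999, 2), round(bkz_2_0, 2), round(speedup))
```

The ratio of the two cycle counts is what the text reports as a speedup of at least 70.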

NTRUSign 

Now, we revisit recent parameters for NTRUSign. In the recent article by Hoffstein et al. [46], a summary
of the latest parameters for NTRU encryption and signature is given. In particular, the smallest parameter
set for NTRUSign is $(N, q) = (157, 256)$, which is claimed to provide 80-bit security against all known
attacks, and 93-bit security against key-recovery lattice attacks. Similarly to [28], we estimate that finding
the secret key is essentially as hard as recovering a vector of norm $< q$ in a lattice of dimension
$2N = 314$ and volume $q^N$, which corresponds to a Hermite factor of $1.00886^{2N}$. We ran our simulation
algorithm for these parameters to guess how many rounds would be required, depending on the blocksize,
starting from a BKZ-20 reduced basis (whose cost is negligible here): the results are summarized
in Table 5.5. We deduce that six rounds of BKZ-110 should be sufficient to break NTRUSign-157, which
corresponds to roughly $2^{11}$ enumerations. And according to Table 5.2, extreme pruning enumeration in
blocksize 110 can be done by searching through at most $2^{47}$ nodes, which corresponds to roughly $2^{54}$
clock cycles on a typical processor. This suggests that the security level of the smallest NTRUSign parameter
set against state-of-the-art lattice attacks is at most 65-bit, rather than 93-bit, which is a significant
gap.
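The numbers in this estimate can be retraced with a short calculation. The sketch assumes that one BKZ round performs about one enumeration per basis index, so that six rounds in dimension 314 give roughly $2^{11}$ enumerations; the figure of $2^{54}$ clock cycles per enumeration is taken from the text as given.

```python
import math

N, q = 157, 256
dim = 2 * N                                    # lattice dimension 2N = 314

# Target norm q divided by vol^(1/dim) = q^(N/dim) = sqrt(q), then the dim-th root.
root_hermite = (q / math.sqrt(q)) ** (1.0 / dim)

rounds = 6                                     # BKZ-110 rounds from Table 5.5
enumerations = rounds * dim                    # assumption: about one enumeration per index
log2_enumerations = math.log2(enumerations)

log2_cycles_per_enum = 54                      # from the text
security_bits = round(log2_enumerations) + log2_cycles_per_enum
print(round(root_hermite, 6), round(log2_enumerations, 2), security_bits)
```

The root Hermite factor agrees with the value 1.00886 quoted above, and the total lands at the 65-bit figure.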
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Chapter 6 


Approximate GCD Algorithm 
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6.1 Introduction 

Following Gentry's breakthrough work [31], there is currently great interest in fully-homomorphic
encryption (FHE), which allows one to compute arbitrary functions on encrypted data. Among the few FHE
schemes known [15,23,31,33,97], the simplest one is arguably the one of van Dijk, Gentry, Halevi and
Vaikuntanathan [97] (vDGHV), published at EUROCRYPT '10. The security of the vDGHV scheme is
based on the hardness of approximate integer common divisors problems introduced in 2001 by
Howgrave-Graham [48]. In the general version of this problem (GACD), the goal is to recover a secret number $p$
(typically a large prime number), given polynomially many near-multiples $x_0, \dots, x_m$ of $p$, that is, each
integer $x_i$ is of the hidden form $x_i = p q_i + r_i$, where each $q_i$ is a very large integer and each $r_i$ is a very
small integer. In the partial version of this problem (PACD), the setting is exactly the same, except that
$x_0$ is chosen as an exact multiple of $p$, namely $x_0 = p q_0$ where $q_0$ is a very large integer chosen such that
no non-trivial factor of $x_0$ can be found efficiently: for instance, [23] selects $q_0$ as a rough number, i.e.
without any small prime factor.


By definition, PACD cannot be harder than GACD, and intuitively, it seems that it should be easier 
than GACD. However, van Dijk et al. [97] mention that there is currently no PACD algorithm that does 
not work for GACD. And the usefulness of PACD is demonstrated by the recent construction [23], where 
Coron, Mandal, Naccache and Tibouchi built a much more efficient variant of the FHE scheme by van 
Dijk et al. [97], whose security relies on PACD rather than GACD. Thus, it is very important to know if 
PACD is actually easier than GACD. 

The hardness of PACD and GACD depends on how the $q_i$'s and the $r_i$'s are exactly generated. For
the generation of [97] and [23], the noise $r_i$ is extremely small, and the best attack known is simply
gcd exhaustive search: for GACD, this means trying every noise $(r_0, r_1)$ and checking whether
$\gcd(x_0 - r_0, x_1 - r_1)$ is sufficiently large and allows to recover the secret key; for PACD, this means
trying every noise $r_1$ and checking whether $\gcd(x_0, x_1 - r_1)$ is sufficiently large and allows to recover
the secret key. In other words, if $\rho$ is the bit-size of the noise $r_i$, then breaking GACD (resp. PACD)
requires $2^{2\rho}$ (resp. $2^{\rho}$) polynomial-time operations, for the parameters of [23,97].
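The PACD variant of gcd exhaustive search is short enough to sketch on a toy instance. All numbers below are made up and far too small to be secure, the noise is assumed non-negative, and `pacd_exhaustive_search` is a hypothetical helper name; a gcd is accepted when it has at least as many bits as the secret.

```python
from math import gcd

def pacd_exhaustive_search(x0, x1, rho, p_bits):
    """Try every noise r1 in [0, 2^rho) and return the first large gcd(x0, x1 - r1)."""
    for r1 in range(2 ** rho):
        g = gcd(x0, x1 - r1)
        if g.bit_length() >= p_bits:
            return g
    return None

# Toy instance with made-up parameters.
p = 1000003                      # secret prime
q0, q1 = 1000033, 999983         # multipliers
r1, rho = 37, 8                  # noise and its bit-size
x0 = p * q0                      # exact multiple of p
x1 = p * q1 + r1                 # near-multiple of p

recovered = pacd_exhaustive_search(x0, x1, rho, p.bit_length())
print(recovered == p)
```

The loop performs up to $2^{\rho}$ gcd computations; for GACD a second loop over $r_0$ would be nested around it, which accounts for the $2^{2\rho}$ cost.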


Our results 

We present new algorithms to solve PACD and GACD, which are exponentially faster in theory and
practice than the best algorithms considered in [23,97]. More precisely, the running time of our new
PACD algorithm is $2^{\rho/2}$ polynomial-time operations, which is essentially the "square root" of that of
gcd exhaustive search. This directly leads to a new GACD algorithm running in $2^{3\rho/2}$ polynomial-time
operations, which is essentially the 3/4-th root of that of gcd exhaustive search. Our PACD algorithm
relies on classical algorithms to evaluate univariate polynomials at many points, whose space requirements
are not negligible. We therefore present additional tricks, some of which reduce the space requirements,
while still providing substantial speedups. This allows us to experimentally break the FHE
challenges proposed by Coron et al. in [23], which were assumed to have comparable security to the
FHE challenges proposed by Gentry and Halevi in [32]: the latter GH-FHE-challenges are based on
hard problems with ideal lattices; according to Chen and Nguyen [19], their security levels are respectively
52-bit (Toy), 61-bit (Small), 72-bit (Medium) and 100-bit (Large). Table 6.1 gives benchmarks for
our attack on the FHE challenges, and deduces speedups compared to gcd exhaustive search. We can
conclude that the FHE challenges of [23] have a much lower security level than those of Gentry and
Halevi [34].

Table 6.1: Time required to break the FHE challenges by Coron et al. [23]. Size in bits, running time in 
seconds for a single 2.27GHz-core with 72Gb of RAM. Timings are extrapolated for RAM > 72 Gb. 


Name                            Toy           Small         Medium                      Large
Size(public key)                0.95Mb        9.6Mb         89Mb                        802Mb
Size(modulus)                   1.6 x 10^5    0.86 x 10^6   4.2 x 10^6                  19 x 10^6
Size(noise)                     17            25            33                          40
Expected security level         > 42          > 52          > 62                        > 72
Running time of gcd-search      2420          8.3 x 10^6    1.96 x 10^10                1.8 x 10^13
                                40 mins       96 days       623 years                   569193 years
Concrete security level         ≈ 42          ≈ 54          ≈ 65                        ≈ 75
Running time of the             99            25665         1.635 x 10^7  6.6 x 10^6    6.79 x 10^10  2.9 x 10^8
new attack implemented          1.6 min       7.1 hours     190 days      76 days       2153 years    9 years
Parameters                      d = 2^8       d = 2^12      d = 2^13      d = 2^15      d = 2^10      d = 2^19
Memory                          < 130 Mb      < 15 Gb       < 72 Gb       ≈ 240 Gb      < 72 Gb       ≈ 25 Tb
Speedup                         24            324           1202          2997          264           62543
New security level              < 37.7        < 45.7        < 55          < 54          < 67          < 59
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Interestingly, we can also apply our technique to different settings, such as noisy factoring and 
attacking low-exponent RSA encryption. A typical example of noisy factoring is the following: assume 
that p is a divisor of a public modulus N, and that one is given a noisy version p' of p, which differs from 
p by at most k bits at unknown positions, can one recover p from ( p', N) faster than exhaustive search? 
This may have applications in side-channel attacks. Like in the PACD setting, we obtain a square-root 
attack: for a 1024-bit modulus, the speedup can be as high as 1200 in practice. Similarly, we speed up 
several exhaustive search attacks on low-exponent RSA encryption. 

Related work 

Multipoint evaluation of univariate polynomials has been used in public-key cryptanalysis before. For
instance, it is used in factoring (such as in the Pollard-Strassen factorization algorithm [76,96] or in
ECM speedup [65]), in the folklore square-root attack on RSA with small CRT exponents (mentioned
by Boneh and Durfee [9], and described in [69,78]), as well as in the recent square-root attack [22] by
Coron, Joux, Mandal, Naccache and Tibouchi on Groth's RSA Subgroup Assumption [41]. But this does
not imply that our attack is trivial, especially since the authors of [23] form a subset of the authors
of [22]. In fact, in most cryptanalytic applications (including [22]) of multipoint evaluation, one is
actually interested in the following problem: given two lists $\{a_i\}_i$ and $\{b_j\}_j$ of numbers modulo $N$, find
a pair $(a_i, b_j)$ such that $\gcd(a_i - b_j, N)$ is non-trivial. Instead, we use multipoint evaluation differently,
as a way to compute certain products of $m$ elements modulo $N$ in $\tilde{O}(\sqrt{m})$ polynomial-time operations,
where $\tilde{O}()$ is the usual notation hiding poly-logarithmic terms. More precisely, it applies to products
$\prod_{i=1}^{m} x_i \bmod N$ which can be rewritten under the form $\prod_{j=1}^{m_1} \prod_{k=1}^{m_2} (y_j + z_k) \bmod N$ where both $m_1$ and
$m_2$ are $O(\sqrt{m})$. The Pollard-Strassen factorization algorithm [76,96] can be viewed as a special case of
this technique: it computes $m! \bmod N$ to factor $N$.

In 2011, Cohn and Heninger [20] announced an attack on PACD and GACD based on Coppersmith's
small root technique. This attack is interesting from a theoretical point of view, but from a practical point
of view, we show that for the FHE challenges of [23], it is expected to be slower than gcd exhaustive
search, and therefore much slower than our attack.

In section 6.2, we describe our square-root algorithm for PACD, and apply it to GACD. In section 6.3, 
we discuss implementation issues, present several tricks to speed up the PACD algorithm in practice, 
and we discuss the impact of our algorithm on the fully-homomorphic challenges of Coron et al. [23]. 
Finally, we apply our main technique to different settings: noisy factoring (section 6.4) and attacking 
low-exponent RSA (section 6.5). 

6.2 A Square-Root Algorithm for Partial Approximate Common Divisors 

In this section, we describe our new square-root algorithm for the PACD problem, which is based on 
evaluating univariate polynomials at many points. In the last subsection, we apply it to GACD. 

6.2.1 Overview 

Consider an instance of PACD: $x_0 = p q_0$ and $x_i = p q_i + r_i$ where $0 \le r_i < 2^{\rho}$, $1 \le i \le m$. We start with
the following basic observation due to Nguyen (as reported in [23, Section 6.1]):

$$p = \gcd\Big(x_0, \prod_{i=0}^{2^{\rho}-1} (x_1 - i) \pmod{x_0}\Big) \qquad (6.1)$$

At first sight, this observation only allows one to replace $2^{\rho}$ gcd computations (with numbers of size $\approx \gamma$
bits) with essentially $2^{\rho}$ modular multiplications (where the modulus has $\approx \gamma$ bits): the benchmarks
of [23] report a speedup of $\approx 5$ for the FHE challenges, which is insufficient to impact security estimates.

However, we observe that (6.1) can be exploited in a much more powerful way as follows. We define
the polynomial $f_j(x)$ of degree j, with coefficients modulo $x_0$:

$$f_j(x) = \prod_{i=0}^{j-1} (x_1 - (x + i)) \pmod{x_0} \qquad (6.2)$$

Letting $\rho' = \lfloor \rho/2 \rfloor$, we notice that:

$$\prod_{i=0}^{2^{\rho}-1} (x_1 - i) = \prod_{k=0}^{2^{\rho' + (\rho \bmod 2)}-1} f_{2^{\rho'}}(2^{\rho'} k) \pmod{x_0}.$$

We can thus rewrite (6.1) as:

$$p = \gcd\Big(x_0, \prod_{k=0}^{2^{\rho' + (\rho \bmod 2)}-1} f_{2^{\rho'}}(2^{\rho'} k) \pmod{x_0}\Big) \qquad (6.3)$$

Clearly, (6.3) allows one to solve PACD using one gcd, $2^{\rho' + (\rho \bmod 2)} - 1$ modular multiplications, and the
multi-evaluation of a polynomial (with coefficients modulo $x_0$) of degree $2^{\rho'}$ at $2^{\rho' + (\rho \bmod 2)}$ points, where
$\rho' + (\rho \bmod 2) = \rho - \rho'$. We claim that this costs at most $\tilde{O}(2^{\rho'}) = \tilde{O}(\sqrt{2^{\rho}})$ operations modulo $x_0$, which
is essentially the square root of gcd exhaustive search. This is obvious for the single gcd and the modular
multiplications. For the multi-evaluation part, it suffices to use classical algorithms (see [57,100]) which
evaluate a polynomial of degree d at d points, using at most $\tilde{O}(d)$ operations in the coefficient ring. Here,
we also need to compute the polynomial $f_{2^{\rho'}}(x)$ explicitly, which can fortunately also be done using
$\tilde{O}(\sqrt{2^{\rho}})$ operations modulo $x_0$. We give a detailed description of the algorithms in the next subsection.
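
The regrouping behind (6.3) can be checked numerically. The following Python sketch is only an
illustration under stated assumptions: the toy values of p, q0, q1 and r1 are hypothetical, the function
names are ours, and each value of $f_j$ is computed by a naive loop. It therefore verifies the identity but
does not achieve the square-root running time, which requires fast multipoint evaluation.

```python
from math import gcd

def f_eval(x1, x0, j, x):
    """f_j(x) = prod_{i=0}^{j-1} (x1 - (x + i)) mod x0, computed naively."""
    acc = 1
    for i in range(j):
        acc = acc * (x1 - (x + i)) % x0
    return acc

def pacd_naive(x0, x1, rho):
    """Equation (6.1): 2^rho modular multiplications followed by one gcd."""
    acc = 1
    for i in range(2 ** rho):
        acc = acc * (x1 - i) % x0
    return gcd(x0, acc)

def pacd_split(x0, x1, rho):
    """Equation (6.3): the same product, grouped into blocks of length 2^rho'."""
    rho_p = rho // 2
    block = 2 ** rho_p
    acc = 1
    for k in range(2 ** (rho_p + rho % 2)):
        acc = acc * f_eval(x1, x0, block, block * k) % x0
    return gcd(x0, acc)

# Hypothetical toy instance: x0 = p*q0 is exact, x1 = p*q1 + r1 is noisy.
p, q0, q1, rho, r1 = 1000003, 999983, 123457, 7, 77
x0, x1 = p * q0, p * q1 + r1
assert pacd_naive(x0, x1, rho) == pacd_split(x0, x1, rho) == p
```

Both functions multiply the same $2^{\rho}$ factors; only the grouping differs, which is what later allows
each block to be obtained as one value of a single polynomial.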


6.2.2 Description 

We first recall our algorithm to solve PACD, given as algorithm 33, and which was implicitly presented 
in the overview. 

Algorithm 33 Solving PACD by multipoint evaluation of univariate polynomials
Input: An instance $(x_0, x_1)$ of the PACD problem with noise size $\rho$.

Output: The secret number p such that $x_0 = p q_0$ and $x_1 = p q_1 + r_1$ with appropriate sizes.
1: Set $\rho' \leftarrow \lfloor \rho/2 \rfloor$.

2: Compute the polynomial $f_{2^{\rho'}}(x)$ defined by (6.2), using algorithm 34.

3: Compute the evaluation of $f_{2^{\rho'}}(x)$ at the $2^{\rho' + (\rho \bmod 2)}$ points $0, 2^{\rho'}, \ldots, 2^{\rho'}(2^{\rho' + (\rho \bmod 2)} - 1)$, using
$2^{\rho \bmod 2}$ times algorithm 35 with $2^{\rho'}$ points. Each application of algorithm 35 requires the computation
of a product tree, using algorithm 34.


Algorithm 33 relies on two classical subroutines (see [57,100]):

• a subroutine to (efficiently) compute a polynomial given as a product of n terms, where n is a
power of two: algorithm 34 does this in $\tilde{O}(n)$ ring operations, provided that quasi-linear multiplication
of polynomials is available, which can be achieved in our case using Fast Fourier techniques.
This subroutine is used in Step 2. The efficiency of algorithm 34 comes from the fact that
when the algorithm requires a multiplication, it only multiplies polynomials of similar degree.

• a subroutine to (efficiently) evaluate a univariate degree-n polynomial at n points, where n is a
power of two: algorithm 35 does this in $\tilde{O}(n)$ ring operations, provided that quasi-linear polynomial
remainder is available, which can be achieved in our case using Fast Fourier techniques. This
subroutine is used in Step 3, and requires the computation of a product tree, which is achieved
by algorithm 34. Algorithm 35 is based on the well-known fact that the evaluation of a univariate
polynomial at a point a is the same as its remainder modulo X - a, which allows one to share
computations using a tree.


Figure 6.1: Polynomial product tree $T = \{t_1, \ldots, t_{2n-1}\}$ for $\{a_1, \ldots, a_n\}$.




Algorithm 34 [T, D] ← TreeProduct(A)

Input: A set of $n = 2^l$ numbers $\{a_1, \ldots, a_n\}$.

Output: The polynomial product tree $T = \{t_1, \ldots, t_{2n-1}\}$, corresponding to the evaluation of points
$A = \{a_1, \ldots, a_n\}$ as shown in Figure 6.1.

$D = [d_1, \ldots, d_{2n-1}]$ descendant indices for non-leaf nodes or 0 for leaf node.

1: for i = 1 ... n do
2:   $t_i \leftarrow X - a_i$ {Initializing leaf nodes}
3:   $d_i \leftarrow 0$
4: end for
5: $i \leftarrow 1$ {Index of lower level}
6: $j \leftarrow n + 1$ {Index of upper level}
7: while $j \le 2n - 1$ do
8:   $t_j \leftarrow t_i \cdot t_{i+1}$
9:   $d_j \leftarrow i$
10:  $i \leftarrow i + 2$
11:  $j \leftarrow j + 1$
12: end while
Figure 6.2: Evaluation on the polynomial tree $T = \{t_1, \ldots, t_{2n-1}\}$ for $\{a_1, \ldots, a_n\}$ (root node: $f \bmod t_{2n-1}$).





Algorithm 35 V ← RecursiveEvaluation(f, $t_i$, D)

Input: A polynomial f of degree n.

A polynomial product tree rooted at $t_i$, and whose leaves are $\{X - a_k, \ldots, X - a_m\}$
An array $D = [d_1, \ldots, d_{2n-1}]$ descendant indices for non-leaf nodes or 0 for leaf node.
Output: $V = \{f(a_k), \ldots, f(a_m)\}$

1: if $d_i = 0$ then
2:   return $\{f(a_i)\}$ {When $t_i$ is a leaf, we apply an evaluation directly.}
3: else
4:   $g_1 \leftarrow f \bmod t_{d_i}$ {left subtree}
5:   $V_1 \leftarrow$ RecursiveEvaluation($g_1$, $t_{d_i}$, D)
6:   $g_2 \leftarrow f \bmod t_{d_i + 1}$ {right subtree}
7:   $V_2 \leftarrow$ RecursiveEvaluation($g_2$, $t_{d_i + 1}$, D)
8:   return $V_1 \cup V_2$
9: end if
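
The two subroutines can be sketched in a few lines of Python. This is a minimal illustration under
stated assumptions, not the implementation used for the benchmarks: the function names are ours, the
tree is stored level by level instead of through the index array D, and polynomial products and
remainders use schoolbook arithmetic, so the sketch has quadratic cost rather than the quasi-linear cost
obtained with Fast Fourier techniques. Polynomials are coefficient lists, lowest degree first.

```python
def poly_mul(a, b, N):
    """Schoolbook product of two coefficient lists, coefficients reduced mod N."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % N
    return out

def poly_mod(f, g, N):
    """Remainder of f modulo a monic polynomial g of degree >= 1."""
    f = [c % N for c in f]
    dg = len(g) - 1
    for i in range(len(f) - 1, dg - 1, -1):
        c = f[i]
        if c:
            for j in range(dg + 1):
                f[i - dg + j] = (f[i - dg + j] - c * g[j]) % N
    return f[:dg]

def tree_product(points, N):
    """Product tree by levels; level 0 holds the leaves X - a_i (len(points) = 2^l)."""
    level = [[(-a) % N, 1] for a in points]
    levels = [level]
    while len(level) > 1:
        level = [poly_mul(level[i], level[i + 1], N) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def multipoint_eval(f, points, N):
    """Evaluate f at all points by reducing it down the product tree."""
    levels = tree_product(points, N)

    def descend(g, depth, idx):
        g = poly_mod(g, levels[depth][idx], N)
        if depth == 0:                      # leaf: g = f mod (X - a) = f(a)
            return [g[0]]
        return (descend(g, depth - 1, 2 * idx)
                + descend(g, depth - 1, 2 * idx + 1))

    return descend(f, len(levels) - 1, 0)
```

For example, with f = 3 + 2X^2 + 5X^3 + X^4 given as `[3, 0, 2, 5, 1]`, calling
`multipoint_eval(f, [1, 2, 3, 4], 10403)` returns the values of f at 1, 2, 3, 4 in that order.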


It follows that the running time of algorithm 33 is $\tilde{O}(2^{\rho'}) = \tilde{O}(\sqrt{2^{\rho}})$ operations modulo $x_0$, which is
essentially the "square root" of gcd exhaustive search. But the space requirement is $\tilde{O}(2^{\rho'}) = \tilde{O}(\sqrt{2^{\rho}})$
polynomially many bits: thus, algorithm 33 can be viewed as a time/memory trade-off, compared to
gcd exhaustive search.

6.2.3 Logarithmic speedup 

In the previous analysis, the time complexity $\tilde{O}(n)$ actually stands for $O(n \log^2 n)$ ring multiplications.
Interestingly, Bostan, Gaudry and Schost showed in [10] that when the structure of the factors is very
regular, there is an algorithm which speeds up the theoretical complexity by a logarithmic term log(n).
This BGS algorithm is tailored for the case where we want to evaluate a function f on a set of points
with what we call a hypercubic structure. An important subprocedure is ShiftPoly which, given as
input a polynomial f of degree at most $2^d$, and the evaluations of f on a set of $2^d$ points with hypercubic
structure, outputs the evaluation of f on a shifted set of $2^d$ points, using $O(2^d)$ ring operations. More
precisely:

Theorem 6.2.1. (see Th. 5 of [10]) Let $\alpha, \beta$ be in ring P and d be in $\mathbb{N}$ such that $d(\alpha, \beta, d)$ is invertible, with
$d(\alpha, \beta, d) = \beta \cdot 2 \cdots d \cdot (\alpha - d\beta) \cdots (\alpha + d\beta)$. And suppose also that the inverse of $d(\alpha, \beta, d)$ is known. Let
$F(\cdot) \in P[X]$ be of degree at most d and $x_0 \in P$. There exists an algorithm ShiftPoly which, given as input $F(x_0)$,
$F(x_0 + \beta), \ldots, F(x_0 + d\beta)$, outputs $F(x_0 + \alpha), F(x_0 + \alpha + \beta), \ldots, F(x_0 + \alpha + d\beta)$ in time $2M(d) + O(d)$
and space O(d). Here, M(d) is the time of multiplying two polynomials of degree at most d.

We write $E(k_1, \ldots, k_j)$ for $\{\sum_{i=1}^{j} p_{k_i} 2^{k_i}\}$ with each $p_{k_i}$ ranging over {0,1}. This is the set enumerating
all possibilities of bits $\{k_1, \ldots, k_j\}$. Given a set A and an element p, A + p is defined as $\{a + p \mid a \in A\}$.
Then we have

$$E(k_1, \ldots, k_{j+1}) = E(k_1, \ldots, k_j) \cup \big(E(k_1, \ldots, k_j) + 2^{k_{j+1}}\big).$$

This is what we call a set with hypercubic structure.
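
The recursive description of a hypercubic set translates directly into code. The following Python
sketch (the function name `hypercube` is ours, chosen for illustration) builds $E(k_1, \ldots, k_j)$ by applying
the union-with-shift rule once per bit position.

```python
def hypercube(positions):
    """E(k_1, ..., k_j): all sums of subsets of {2^k : k in positions}."""
    points = [0]
    for k in positions:
        # E(k_1..k_{j+1}) = E(k_1..k_j) union (E(k_1..k_j) + 2^{k_{j+1}})
        points = points + [x + (1 << k) for x in points]
    return points
```

Each step doubles the set by adding a single shift, which is the structure ShiftPoly exploits: the
evaluations on the shifted half are derived from those on the existing half.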

Given a linear polynomial f(x) and a set with hypercubic structure of $2^{\rho}$ points, the proposed algorithm
iteratively calls algorithm 36, which uses ShiftPoly, and calculates the evaluation of
$F_i(X) = \prod_{Y \in E(k_1, \ldots, k_i)} f(X + Y)$ on $E(k_{\rho-i+1}, \ldots, k_{\rho})$ until $i = \lfloor \rho/2 \rfloor$. The i-th iteration costs $O(2^i)$ ring operations,
thus the total complexity amounts to $O(2^{\rho/2})$ ring operations.


Algorithm 36 i-th iteration of the evaluation of $F_i(X)$

Input: For $i = 1, \ldots, \lfloor \rho/2 \rfloor$, the evaluation of $F_i(X)$ on points $X \in E(k_{\rho-i+1}, \ldots, k_{\rho})$
Output: the evaluation of $F_{i+1}(X)$ on points $X \in E(k_{\rho-i}, \ldots, k_{\rho})$
1: $F_i(X)$ for $X \in E(k_{\rho-i+1}, \ldots, k_{\rho}) + 2^{k_{\rho-i}}$ ← ShiftPoly($F_i(X)$, $X \in E(k_{\rho-i+1}, \ldots, k_{\rho})$)
2: $F_i(X)$ for $X \in E(k_{\rho-i}, \ldots, k_{\rho}) + 2^{k_{i+1}}$ ← ShiftPoly($F_i(X)$, $X \in E(k_{\rho-i}, \ldots, k_{\rho})$)
3: $F_{i+1}(X) = F_i(X) \cdot F_i(X + 2^{k_{i+1}})$, for all $X \in E(k_{\rho-i}, \ldots, k_{\rho})$


6.2.4 Application to GACD 

Any PACD algorithm can be used to solve GACD, using the trivial reduction from GACD to PACD
based on exhaustive search over the noise $r_0$. More precisely, for an arbitrary instance of GACD:

$x_i = p q_i + r_i$ where $0 \le r_i < 2^{\rho}$, $0 \le i \le m$

we apply our PACD algorithm for all pairs $(x_0 - r_0, x_1)$ where $r_0$ ranges over $\{0, \ldots, 2^{\rho} - 1\}$.

It follows that GACD can be solved in $\tilde{O}(2^{3\rho/2})$ operations modulo $x_0$, using $\tilde{O}(2^{\rho/2})$ polynomially
many bits. This is exponentially faster than the best attack of [97], namely gcd exhaustive search, which
required $2^{2\rho}$ gcd operations. Note that in [97], another hybrid attack was described, where one performs
exhaustive search over $r_0$ and factors the resulting number using ECM, but because of the large size of the
prime factors (namely, a bit-length $\ge \rho^2$), this attack is not faster: it also requires at least $2^{2\rho}$ operations.

Following our work, it was noted in [24] that one can heuristically beat the GACD bound $\tilde{O}(2^{3\rho/2})$
using more samples $x_i$, by removing the "smooth part" of $\gcd(y_1, \ldots, y_s)$ where $y_i = \prod_{j=0}^{2^{\rho}-1} (x_i - j)$
and s is large enough. The choice of s actually gives different time/memory trade-offs. For instance,
if $s = \Theta(\rho)$, the running time is heuristically $\tilde{O}(2^{\rho})$ poly-time operations and similar memory. From a
practical point of view however, our attack is arguably more useful, due to memory requirements and
better $\tilde{O}()$ constants.
6.3 Implementation of the Square-Root PACD Algorithm

We implemented both algorithm 33 and the logarithmic speedup using the NTL library [91]. In this
section, we describe various tricks that we used to implement algorithm 33 efficiently. The implementation
was not straightforward due to the size of the FHE challenges.

6.3.1 Obstructions 

The main obstruction when implementing algorithm 33 is memory. Consider the Large FHE-challenge
from [23]: there, $\rho = 40$, so the optimal parameter is $\rho' = 20$, which implies that $f_{2^{\rho'}}$ is a polynomial
of degree $2^{20}$ with coefficients of size $19 \times 10^6$ bits. In other words, simply storing $f_{2^{\rho'}}$ already requires
$2^{20} \times 19 \times 10^6$ bits, which is more than 2Tb, while we also need to perform various computations. This
means that in practice, we will have to settle for suboptimal parameters.
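
The storage figure is a one-line computation. The Python sketch below (the helper name
`storage_bits` is ours) reproduces it for the Large challenge, counting one full-size coefficient per
monomial of degree below $2^{\rho'}$.

```python
def storage_bits(rho_prime, coeff_bits):
    """Bits needed to store 2^rho_prime coefficients of coeff_bits bits each."""
    return (2 ** rho_prime) * coeff_bits

large_bits = storage_bits(20, 19 * 10 ** 6)   # Large challenge: rho' = 20
large_bytes = large_bits // 8                 # about 2.49 * 10^12 bytes
```

The result is roughly 2.49 terabytes for the polynomial alone, before counting the product tree or
any intermediate values.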

More precisely, assume that we select an additional parameter d, which is a power of two less than
$2^{\rho'}$. We rewrite (6.3) as:

$$p = \gcd\Big(x_0, \prod_{k=0}^{2^{\rho}/d - 1} f_d(dk) \pmod{x_0}\Big) \qquad (6.4)$$

This gives rise to a constrained version of algorithm 33, called algorithm 37.


Algorithm 37 Solving PACD by multipoint evaluation of univariate polynomials, using fixed memory

Input: An instance $(x_0, x_1)$ of the PACD problem with noise size $\rho$, and a polynomial degree d (which
must be a power of two).

Output: The secret number p such that $x_0 = p q_0$ and $x_1 = p q_1 + r_1$ with appropriate sizes.

1: Compute the polynomial $f_d(x)$ defined by (6.2), using algorithm 34.

2: Compute the evaluation of $f_d(x)$ at the $2^{\rho}/d$ points $0, d, 2d, \ldots, d(2^{\rho}/d - 1)$, using $2^{\rho}/d^2$ times
algorithm 35 with d points. Each application of algorithm 35 requires the computation of a product tree,
using algorithm 34.


The running time of algorithm 37 is $\frac{2^{\rho}}{d^2} \cdot \tilde{O}(d)$ elementary operations modulo $x_0$, and the space
requirement is $\tilde{O}(d)$ polynomially many bits. Note that each of the $2^{\rho}/d^2$ applications of algorithm 35
can be done in parallel.
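
The fixed-memory variant can be sketched as follows in Python. This is an illustration under stated
assumptions: the function names and the toy instance in the usage note are hypothetical, $f_d$ is built
by multiplying its d linear factors one at a time, and plain Horner evaluation stands in for algorithm 35.
The sketch therefore shows the batching of (6.4), with only the d + 1 coefficients of $f_d$ kept in memory,
and not the quasi-linear cost per batch.

```python
from math import gcd

def build_f(x1, x0, d):
    """Coefficients (lowest degree first) of f_d(x) = prod_{i<d} (x1 - i - x) mod x0."""
    f = [1]
    for i in range(d):
        c = (x1 - i) % x0                  # next factor is (c - x)
        g = [0] * (len(f) + 1)
        for t, a in enumerate(f):
            g[t] = (g[t] + a * c) % x0
            g[t + 1] = (g[t + 1] - a) % x0
        f = g
    return f

def horner(f, x, m):
    """Evaluate the coefficient list f at x modulo m."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % m
    return acc

def pacd_fixed_memory(x0, x1, rho, d):
    """Equation (6.4): 2^rho / d points, processed in batches of d points."""
    f = build_f(x1, x0, d)
    n_points = 2 ** rho // d
    acc = 1
    for start in range(0, n_points, d):    # 2^rho / d^2 independent batches
        for k in range(start, min(start + d, n_points)):
            acc = acc * horner(f, d * k, x0) % x0   # stand-in for algorithm 35
    return gcd(x0, acc)
```

With the hypothetical instance p = 1000003, q0 = 999983, q1 = 123457, r1 = 200, the call
`pacd_fixed_memory(p * q0, p * q1 + r1, 8, 4)` uses 16 batches of 4 points and recovers p. Since the
batches are independent, they could be distributed over several cores.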

6.3.2 Tricks 

The use of algorithm 37 allows several tricks, which we now present. 

Minimizing the Product Tree 

Each application of algorithm 35 requires the computation of a product tree, using algorithm 34. But
this product tree requires storing 2n - 1 polynomials. Fortunately, these polynomials have coefficients
which are in some sense much smaller than the modulus $x_0$: this is because we evaluate the polynomial
$f_d(x)$ at points in $\{0, \ldots, 2^{\rho} - 1\}$, which are very small compared to the modulus $x_0$. However, a naive
implementation would not exploit this. For instance, consider the polynomial $(X - a_1)(X - a_2) = X^2 -
(a_1 + a_2)X + a_1 a_2$, which belongs to the product tree. In a typical library for polynomial computations,
the polynomial coefficients would be represented as positive residues modulo $x_0$. But if $a_1 + a_2$ is small,
then $-(a_1 + a_2) + x_0$ is actually big. This means that many coefficients of the product tree polynomials
will actually be as big as $x_0$, if they are represented as positive residues modulo $x_0$, which drastically
reduces the choice of the degree d.

To avoid this problem, we instead slightly modify the polynomial $f_d(X)$, in order to evaluate at small
negative numbers inside $\{0, \ldots, 1 - 2^{\rho}\}$, so that each polynomial of the product tree has "small" positive
coefficients. This drastically reduces the storage of the product tree. More precisely, we rewrite (6.4) as:

$$p = \gcd\Big(x_0, \prod_{k=0}^{2^{\rho}/d^2 - 1} \prod_{\ell=0}^{d-1} f'_{d,k}(-\ell d) \pmod{x_0}\Big) \qquad (6.5)$$

where

$$f'_{d,k}(x) = \prod_{i=0}^{d-1} (x_1 - 2^{\rho} - x + dk - i) \pmod{x_0} \qquad (6.6)$$

Each product $\prod_{\ell=0}^{d-1} f'_{d,k}(-\ell d) \pmod{x_0}$ is computed by applying algorithm 35 once, using the d points
$0, -d, -2d, \ldots, -d(d-1)$.

Powers of Two 

We need to compute the polynomial $f'_{d,k}(x)$ defined by (6.6) before each application of algorithm 35,
using a simplified version of algorithm 34, which only computes the root rather than the whole product
tree. However, notice that the degree of each polynomial of the product tree is exactly a power of two,
which is the worst case for the polynomial multiplication implemented in the NTL library [91]. For
instance, in NTL, multiplying two 512-degree polynomials with Medium-FHE coefficients takes 50%
more time than multiplying two 511-degree polynomials with Medium-FHE coefficients.

To circumvent threshold phenomena, we notice that each polynomial of the product tree is a
monic polynomial, except the leaves (for which the leading coefficient is -1). But the product of two
monic polynomials whose degree is a power of two can be derived efficiently from the product of two
polynomials with degree strictly less than the power of two, using:

$$(X^n + P(X)) \times (X^n + Q(X)) = X^{2n} + X^n (P(X) + Q(X)) + P(X) Q(X).$$

We apply this trick to speed up the computation of the polynomial $f'_{d,k}(x)$.
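
The identity can be checked on small inputs. The Python sketch below is only an illustration (the
function names are ours, and coefficients are plain integers rather than residues modulo $x_0$): it
multiplies two monic polynomials of degree n using a single product of polynomials of degree below n.

```python
def mul(a, b):
    """Schoolbook product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def monic_mul(P, Q, n):
    """(X^n + P)(X^n + Q) for deg P, deg Q < n, with P and Q given as n coefficients."""
    assert len(P) == n and len(Q) == n
    out = [0] * (2 * n + 1)
    for i, c in enumerate(mul(P, Q)):      # P*Q has degree at most 2n - 2
        out[i] += c
    for i in range(n):                     # X^n * (P + Q)
        out[n + i] += P[i] + Q[i]
    out[2 * n] = 1                         # X^{2n}
    return out
```

The only genuine multiplication is `mul(P, Q)` on inputs of n coefficients each, so a library whose
fast path degrades at degree exactly n is called just below that threshold.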

Precomputations 

Now that we use (6.5), we change several times the polynomial $f'_{d,k}(x)$ but we keep the same evaluation
points $0, -d, -2d, \ldots, -d(d-1)$, and therefore the same product tree. This allows us to perform
precomputations to speed up algorithm 35. Indeed, the main operation of algorithm 35 is computing
the remainder of a polynomial with one of the product tree polynomials, and it is well-known that this
can be sped up using precomputations depending on the modulus polynomial. One classical way to do
this is to use Newton's method for remainder (algorithm 38). This algorithm requires the following
notation: for any polynomial f of degree n and for any integer $m \ge n$, we define the m-degree polynomial
rev(f, m) as $\mathrm{rev}(f, m) = f(1/X) \cdot X^m$. In algorithm 38, Line 1 is independent of f. Therefore, whenever
one needs to compute many remainders with respect to the same modulus g, it is more efficient
to precompute and store the inverse $\bar{g}$ computed in Line 1, so that Line 1 does not need to be reexecuted.
Hence, in an offline phase, we precompute and store (on a hard disk) the polynomial $\bar{g}$ of Line 1 for each
product tree polynomial. And for each remainder required by algorithm 35, we execute the last two lines
of algorithm 38.

Algorithm 38 Remainder using Newton's method (see [57, Section 7.2])

Input: Polynomials $f \in R[X]$ of degree 2n - 1, $g \in R[X]$ of degree n.
Output: The polynomial $h = f \bmod g$
1: $\bar{g} \leftarrow \mathrm{Inverse}(\mathrm{rev}(g, n)) \bmod X^n$
2: $s \leftarrow \mathrm{rev}(f, 2n - 1) \cdot \bar{g} \bmod X^n$
3: $h \leftarrow f - \mathrm{rev}(s, n - 1) \cdot g$


It follows that each remainder operation of algorithm 35 is reduced to two polynomial multiplications.
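
The split between the offline inverse and the two online multiplications can be sketched as follows in
Python. This is a minimal illustration under stated assumptions, not NTL code: the function names are
ours, products are schoolbook products truncated modulo $X^n$, the modulus polynomial g is assumed
monic (as in the product tree), and coefficients live in Z/N for an integer N.

```python
def mul_trunc(a, b, n, N):
    """a * b mod X^n, coefficients mod N (schoolbook)."""
    out = [0] * n
    for i, x in enumerate(a[:n]):
        if x:
            for j, y in enumerate(b[:n - i]):
                out[i + j] = (out[i + j] + x * y) % N
    return out

def rev(f, m):
    """rev(f, m) = X^m * f(1/X): pad f to m + 1 coefficients and reverse."""
    return (f + [0] * (m + 1 - len(f)))[::-1]

def series_inverse(h, n, N):
    """Inverse of h modulo X^n by Newton iteration; assumes h[0] == 1."""
    g, prec = [1], 1
    while prec < n:
        prec = min(2 * prec, n)
        hg = mul_trunc(h, g, prec, N)
        corr = [(-c) % N for c in hg]
        corr[0] = (2 - hg[0]) % N          # corr = 2 - h*g
        g = mul_trunc(g, corr, prec, N)
    return g

def precompute(g, N):
    """Offline phase (Line 1): inverse of rev(g, n) modulo X^n, g monic of degree n."""
    n = len(g) - 1
    return series_inverse(rev(g, n), n, N)

def newton_rem(f, g, g_bar, N):
    """Online phase (Lines 2 and 3): f mod g for deg f <= 2n - 1."""
    n = len(g) - 1
    s = mul_trunc(rev(f, 2 * n - 1), g_bar, n, N)   # reversed quotient
    q = rev(s, n - 1)
    qg = mul_trunc(q, g, n, N)                      # only the low n coefficients matter
    return [((f[i] if i < len(f) else 0) - qg[i]) % N for i in range(n)]
```

Once `precompute(g, N)` has been stored, every later call to `newton_rem` with the same g costs two
truncated products, which matches the count stated above.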

The NTL library also contains routines for doing remainders with precomputations, but algorithm 38
turns out to be more efficient for our setting. This is because many factors impact the performance
of polynomial arithmetic, such as the size of the modulus and the degree.

6.3.3 Logarithmic Speedup and Further Tricks 

We also implemented the BGS algorithm described in section 6.2.3, which offers an asymptotic logarithmic
speedup, but our implementation was not optimized due to lack of time: a good implementation
would require the so-called middle product [10], which we instantiated by a normal product. On the
FHE challenges, our implementation turned out to be twice as slow as algorithm 33 for Medium and
Large, and marginally slower (resp. faster) for Toy (resp. Small).

Since memory is the main obstruction for choosing d, it is very important to minimize RAM requirements.
Since algorithm 35 can be reduced to multiplications using precomputations, one may consider
the use of special multiplication algorithms which require less memory than standard algorithms, such
as in-place algorithms. We note that there has been recent work [45,83] in this direction, but we did not
implement these algorithms. This suggests that our implementation is unlikely to be optimal, and that
there is room for improvement.

6.3.4 New Security Estimates for the FHE Challenges 

Table 6.1 reports benchmarks for our implementation on the fully-homomorphic-encryption challenges
of Coron et al. [23], which come in four flavours: Toy, Small, Medium and Large. The security level $\lambda$ is
defined in [23] as follows: the best attack should require at least $2^{\lambda}$ clock cycles on a standard
single core. The row "Expected security level" is extracted from [23].

Our timings refer to a single 2.27GHz-core with 72Gb of RAM. First, we assessed the cost of gcd 
exhaustive search, by measuring the running time of the (quasi-linear) gcd routine of the widespread 
gmp library, which is used in NTL [91]: timings were measured for each modulus size of the four 
FHE-challenges. This gives the "concrete security level" row, which is slightly higher than the expected 
security level of [23]. 

We also report timings for our implementation of our square-root PACD algorithm: these timings
are below the expected security level, which breaks all four FHE-challenges of [23]. For the Toy and
Small challenges, the parameter d was optimal, and we did not require much memory: the speedup
is respectively 24 and 324, compared to gcd exhaustive search. For the Medium and Large challenges,
we had to use a suboptimal parameter d, due to RAM constraints: we used $d = 2^{13}$ (resp. $d = 2^{10}$) for
Medium (resp. Large), instead of the optimal $d = 2^{16}$ (resp. $d = 2^{20}$). But the speedups are already
significant: 1202 for Medium, and 264 for Large. The timings are obtained by suitably multiplying the
running time of a single execution of algorithm 35 and algorithm 34: for instance, in the Large case, this
online phase took between 64727s and 65139.4s, for 5 executions, and the precomputation storage was
21Gb.
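
The security levels of Table 6.1 can be related to the running times by a short computation. The
Python sketch below assumes, as our reading of the definition above, that a security level is the base-2
logarithm of the running time in seconds multiplied by the clock rate of the 2.27GHz core; the names
`CLOCK_HZ` and `security_bits` are ours.

```python
from math import log2

CLOCK_HZ = 2.27e9   # assumed cycles per second of the single core used for the timings

def security_bits(seconds):
    """log2 of the number of clock cycles spent in `seconds` seconds."""
    return log2(seconds * CLOCK_HZ)

toy_new = security_bits(99)         # new attack, Toy challenge
small_new = security_bits(25665)    # new attack, Small challenge
toy_gcd = security_bits(2420)       # gcd exhaustive search, Toy challenge
```

Under this assumption the three values come out close to 37.7, 45.7 and 42 bits, in line with the
corresponding entries of the table.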

Table 6.1 also provides extrapolated figures if the RAM was > 72 Gb, which allows larger values
of d: today, one can already buy servers with 4-Tb RAM. For the Large challenge, the potential
speedup is over 60,000. Using a more optimized implementation, we believe it is possible to obtain
larger speedups, so the "New security level" row should only be interpreted as an upper bound. But our
implementation is already sufficient to show that the FHE-challenges of [23] fall short of the expected
security level.

Hence, one needs to increase the parameters of the FHE scheme of [23], which makes it less competitive
with the FHE implementation of [34]. It can be noted that the new security levels of the challenges
of [23] are much lower than those given by [19] on the challenges of Gentry and Halevi [34], namely
52-bit (Toy), 61-bit (Small), 72-bit (Medium) and 100-bit (Large).


6.4 Applications to Noisy Factoring 


Consider a typical "balanced" RSA modulus N = pq where $p, q \le 2\sqrt{N}$. A celebrated lattice-based
cryptanalysis result of Coppersmith [21] states that if one is given half of the bits of p, either in the 
most significant positions, or the least significant positions, then one can recover p and q in polynomial 
time. Although this attack has been extended in several works (see [59] for a survey), all these lattice- 
based results require that the unknown bits are consecutive, or spread across extremely few blocks. This 
decreases its potential applications to side-channel attacks where errors are likely to be spread unevenly. 

This suggests the following setting, which we call noisy factoring. Assume that one is given a noisy 
version p' of the prime factor p, which differs from p by at most k bits, not necessarily consecutive, 
under either of the following two cases: 


• If the k positions of the noisy bits are known, we can recover p (and therefore q) by exhaustive 
search using at most 2 k polynomial-time operations: we stress that in this case, we assume that 
we do not know if each of the k bits has been flipped, otherwise no search would be necessary. 


• If instead, none of the positions is known, but we know that exactly k bits have been modified,
we can recover p by exhaustive search using at most $\binom{n}{k}$ polynomial-time operations, where
n is the bit-length of p. If we only know an upper bound on the number of modified bits, we can
simply repeat the attack with decreasing values of k.


These running times do not require that p and q are balanced. 

In this section, we show that our previous technique for PACD can be adapted to noisy factoring,
yielding new attacks whose running time is essentially the "square root" of exhaustive search, that is,
$\tilde{O}(2^{k/2})$ or $\tilde{O}\big(\sqrt{\binom{n}{k}}\big)$ polynomial-time operations, depending on the case. Finally, we also extend
our method to fault attacks on the RSA signature scheme.


6.4.1 Known positions 

We assume that the prime number p has n bits, so that: $p = \sum_{i=0}^{n-1} p_i 2^i$ where $p_i \in \{0,1\}$ for $0 \le i \le n - 1$.

In this subsection, we assume that all the bits $p_i$ are known, except possibly at k positions $b_1, \ldots, b_k$,
which we sort, so that: $0 \le b_1 < b_2 < \cdots < b_k < n$. Denote by $p^{(1)}, \ldots, p^{(2^k)}$ the $2^k$ possibilities for p, when
$(p_{b_1}, \ldots, p_{b_k})$ ranges over $\{0,1\}^k$. With high probability, all the $p^{(i)}$'s are coprime with N, except one,
which would imply that:

$$p = \gcd\Big(N, \prod_{i=1}^{2^k} p^{(i)} \pmod{N}\Big) \qquad (6.7)$$

A naive evaluation of (6.7) costs $2^k$ modular multiplications, and one single gcd. We now show that this
evaluation can be performed more efficiently using $\tilde{O}(2^{k/2})$ arithmetic operations with numbers with
the same size as N.

The unknown bits $p_{b_1}, \ldots, p_{b_k}$ can be regrouped into two sets $\{p_{b_1}, \ldots, p_{b_\ell}\}$ and $\{p_{b_{\ell+1}}, \ldots, p_{b_k}\}$ of
roughly the same size $\ell = \lfloor k/2 \rfloor$, as illustrated in Figure 6.3:

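• For $1 \le i \le 2^{\ell}$, let $y^{(i)} = \sum_{j=0}^{n-1} y_j^{(i)} 2^j$ where $y_j^{(i)}$ is: 0 if $j > b_\ell$; the t-th bit of i if there exists
$1 \le t \le \ell$ with $j = b_t$; and $p_j$ otherwise.

• For $1 \le i \le 2^{k-\ell}$, let $x^{(i)} = \sum_{j=0}^{n-1} x_j^{(i)} 2^j$ where $x_j^{(i)}$ is: 0 if $j \le b_\ell$; the $(t - \ell)$-th bit of i if there
exists $t > \ell$ with $j = b_t$; and $p_j$ otherwise.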


Figure 6.3: Splitting the Unknown Bits in Two (rows: p with k unknown bits, $y^{(i)}$, $x^{(i)}$; unknown
positions $b_k, \ldots, b_{\ell+1}$ and $b_\ell, \ldots, b_2, b_1$)


Hence, by definition of $x^{(j)}$ and $y^{(i)}$, we have:

$$\prod_{i=1}^{2^k} p^{(i)} = \prod_{i=1}^{2^{\ell}} \prod_{j=1}^{2^{k-\ell}} (x^{(j)} + y^{(i)}) \pmod{N} \qquad (6.8)$$

which gives rise to a square-root algorithm (algorithm 39) to solve the noisy factorization problem with
known positions.

Algorithm 39 Noisy Factorization With Known Positions

Input: An RSA modulus N = pq and the bits $p_0, \ldots, p_{n-1}$ of p, except the k bits $p_{b_1}, \ldots, p_{b_k}$, where the
bit positions $b_1 < b_2 < \cdots < b_k$ are known.

Output: The secret factor $p = \sum_{i=0}^{n-1} p_i 2^i$ of N.

1: Compute the polynomial $f(X) = \prod_{i=1}^{2^{\ell}} (X + y^{(i)}) \bmod N$ of degree $2^{\ell}$, with coefficients modulo N,
using algorithm 34.

2: Compute the evaluation of f(X) at the points $\{x^{(1)}, \ldots, x^{(2^{k-\ell})}\}$, using 1 + (k mod 2) times
algorithm 35 with $2^{\ell}$ points.

3: return $p \leftarrow \gcd\big(N, \prod_{i=1}^{2^{k-\ell}} f(x^{(i)}) \bmod N\big)$
Similarly to section 6.2, the cost of algorithm 39 is $\tilde{O}(2^{k/2})$ polynomial-time operations. This is an
exponential improvement over naive exhaustive search, but algorithm 39 requires exponential space.
In practice, the improvement is substantial. Using our previous implementation, algorithm 39 gives
a speedup of about 1200 over exhaustive division to factor a 1024-bit modulus, given a 512-bit noisy
factor with 46 unknown bits at known positions.
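
The split of the unknown bits into a low half and a high half can be illustrated on a toy modulus. The
Python sketch below rests on stated assumptions: the function names and the small primes in the usage
note are hypothetical, and the double product of (6.8) is accumulated by a plain double loop, so the
sketch shows the decomposition but not the square-root cost, which requires evaluating
f(X) at all the x values with algorithms 34 and 35.

```python
from math import gcd
from itertools import product

def candidates(known, positions):
    """All values known + sum(bit_t << positions[t]) over every choice of bits."""
    out = []
    for bits in product((0, 1), repeat=len(positions)):
        out.append(known + sum(b << pos for b, pos in zip(bits, positions)))
    return out

def noisy_factor_known(N, p_known, positions):
    """p_known is p with the unknown bits cleared; positions are sorted bit indices."""
    l = len(positions) // 2
    low_mask = (1 << (positions[l - 1] + 1)) - 1 if l else 0
    ys = candidates(p_known & low_mask, positions[:l])    # bits up to b_l
    xs = candidates(p_known & ~low_mask, positions[l:])   # bits above b_l
    acc = 1
    for x in xs:
        for y in ys:                    # every x + y is one candidate for p
            acc = acc * (x + y) % N
    return gcd(N, acc)
```

For instance, with the hypothetical primes p = 65521 and q = 2147483647 and the bits of p at positions
2, 5, 9 and 12 erased, `noisy_factor_known(p * q, p_known, [2, 5, 9, 12])` returns p.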

Furthermore, in this setting, the points to be enumerated happen to satisfy the hypercubic property, 
thus we may apply the logarithmic speedup described in section 6.2.3. 

Remember that the factor p can be calculated with formula (6.8). Now we can restate it as

$$\prod_{i=1}^{2^k} p^{(i)} = \prod_{y \in E(b_{\ell+1}, \ldots, b_k)} \prod_{x \in E(b_1, \ldots, b_\ell)} (x + y + M_p) \pmod{N}, \qquad (6.9)$$

where $M_p = \sum_{i \notin \{b_1, \ldots, b_k\}} p_i 2^i$ collects the known bits of p. We define $F_i(X) = \prod_{y \in E(b_1, \ldots, b_i)} (X + y + M_p)$.


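Algorithm 40 Improved Noisy Factorization With Known Positions

Input: An RSA modulus N = pq and the bits $p_0, \ldots, p_{n-1}$ of p, except the k bits $p_{b_1}, \ldots, p_{b_k}$, where the
bit positions $b_1 < b_2 < \cdots < b_k$ are known.

Output: The secret factor p.

1: $F_0(0) \leftarrow M_p$
2: for $i = 1, \ldots, \lfloor k/2 \rfloor$ do
3:   Call algorithm 36 to calculate the evaluation of $F_i(X)$ on $E(b_{k-i}, \ldots, b_k)$ given the evaluation of $F_{i-1}(X)$
     on $E(b_{k-i+1}, \ldots, b_k)$
4: end for
5: if k is odd then
6:   The evaluation $F_{\lfloor k/2 \rfloor}(X)$ for $X \in E(b_{\lfloor k/2 \rfloor + 2}, \ldots, b_k) + 2^{b_{\lfloor k/2 \rfloor + 1}}$ ←
     ShiftPoly($F_{\lfloor k/2 \rfloor}(X)$, $X \in E(b_{\lfloor k/2 \rfloor + 2}, \ldots, b_k)$, $2^{b_{\lfloor k/2 \rfloor + 1}}$)
7: end if
8: $p'' \leftarrow \gcd\big(N, \prod_{X \in E(b_{\lfloor k/2 \rfloor + 1}, \ldots, b_k)} F_{\lfloor k/2 \rfloor}(X) \bmod N\big)$
9: return $p''$
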


As discussed in section 6.2, algorithm 40 is in theory faster than algorithm 39 by a factor of O(k).

6.4.2 Unknown positions 

In this subsection, we assume that p' differs from p by exactly k bits at unknown positions, and that p'
has bit-length n. Our attack is somewhat reminiscent of Coppersmith's baby-step/giant-step attack on
low-Hamming-weight discrete logarithm [95], but that attack uses sorting, not multipoint evaluation.
To simplify the description, we assume that both k and n are even, but the attack can easily be adapted
to the general case.

Pick a random subset S of $\{0, \ldots, n - 1\}$ containing exactly n/2 elements. The probability that S
contains the indices of exactly k/2 flipped bits is: $\binom{n/2}{k/2}^2 / \binom{n}{k} \approx 1/\sqrt{k}$. We now assume that this
event holds, and let $\ell = \binom{n/2}{k/2}$. Similarly to the previous subsection, we define:

• Let $x^{(i)}$ for $1 \le i \le \ell$ be the numbers obtained by copying the bits of p' at all the positions inside
S, and flipping exactly k/2 bits: all the other bits are set to zero.

• Let $y^{(i)}$ for $1 \le i \le \ell$ be the numbers obtained by copying the bits of p' at all the positions outside
S, and flipping exactly k/2 bits: all the other bits are set to zero.

Now, with high probability over the choice of (p, q), we may write:

$$p = \gcd\Big(N, \prod_{i=1}^{\ell} \prod_{j=1}^{\ell} (x^{(i)} + y^{(j)}) \pmod{N}\Big) \qquad (6.10)$$


which gives rise to a square-root algorithm (algorithm 41) to solve the noisy factorization problem with 
unknown positions. 

Algorithm 41 Noisy Factorization With Unknown Positions

Input: An RSA modulus N = pq and a number p' differing from p by exactly k bits of unknown
position.

Output: The secret factor p.

1: repeat
2:   Pick a random subset S of $\{0, \ldots, n - 1\}$ containing exactly n/2 elements.
3:   Compute the integers $x^{(i)}$ and $y^{(i)}$ for $1 \le i \le \ell = \binom{n/2}{k/2}$.
4:   Compute the polynomial $f(X) = \prod_{i=1}^{\ell} (X + y^{(i)}) \bmod N$.
5:   Compute the evaluation of f(X) at the $\ell$ points $\{x^{(1)}, \ldots, x^{(\ell)}\}$.
6:   $p'' \leftarrow \gcd\big(N, \prod_{i=1}^{\ell} f(x^{(i)}) \bmod N\big)$
7: until $p'' > 1$
8: return $p''$



Similarly to section 6.2, the expected cost of algorithm 41 is $\tilde{O}(\ell \sqrt{k})$ polynomial-time operations,
where $\ell = \binom{n/2}{k/2}$ is roughly $\sqrt{\binom{n}{k}}$. This is an exponential improvement over naive exhaustive
search, but algorithm 41 requires exponential space.

Algorithm 41 is randomized, but like Coppersmith's baby-step/giant-step attack on low-Hamming-weight
discrete logarithm [95], it can easily be derandomized using splitting systems. Deterministic
versions are slightly less efficient, by a small polynomial factor: see [95].
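
The randomized splitting can be illustrated on a toy modulus. The Python sketch below rests on stated
assumptions: the function names and the small primes in the usage note are hypothetical, the random
generator is passed in explicitly, and the product of (6.10) is accumulated by a double loop in place of
the multipoint evaluation of Steps 4 and 5, so only the structure of the attack is shown.

```python
import random
from math import gcd
from itertools import combinations

def flips(value, positions, count):
    """All numbers obtained from value by flipping exactly count bits among positions."""
    out = []
    for chosen in combinations(positions, count):
        v = value
        for b in chosen:
            v ^= 1 << b
        out.append(v)
    return out

def noisy_factor_unknown(N, p_noisy, n, k, rng):
    """Repeat random splits until the flipped bits are shared evenly between S and its complement."""
    while True:
        S = set(rng.sample(range(n), n // 2))
        in_mask = sum(1 << b for b in S)
        out_mask = ((1 << n) - 1) ^ in_mask
        xs = flips(p_noisy & in_mask, sorted(S), k // 2)
        ys = flips(p_noisy & out_mask, sorted(set(range(n)) - S), k // 2)
        acc = 1
        for x in xs:
            for y in ys:
                acc = acc * (x + y) % N
        g = gcd(N, acc)
        if 1 < g < N:
            return g
```

For instance, with the hypothetical primes p = 65521 and q = 2147483647, flipping bits 3 and 10 of p
and calling `noisy_factor_unknown(p * q, p_noisy, 16, 2, random.Random(0))` returns p after a few
random splits.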


6.4.3 Application to fault attacks on CRT-RSA signatures 

Consider an RSA signature $s = m^d \bmod N$, where N = pq. In practice, this calculation is often accelerated
using the Chinese remainder theorem. More precisely, s is derived from $s_p$ and $s_q$, where
$s_p = m^d \bmod p$ and $s_q = m^d \bmod q$. It is well-known that if a fault occurs during the computation of $s_p$
(but not $s_q$), then the output s' will satisfy $s' \not\equiv m^d \pmod{p}$ and $s' \equiv m^d \pmod{q}$. Since $s'^e \equiv m$ then
holds modulo q but not modulo p, the factorization of N is disclosed by


$$q = \gcd(s'^e - m \bmod N, N)$$
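
This classical fault attack is easy to reproduce on a toy key. The Python sketch below rests on stated
assumptions: the primes, exponent and message in the usage note are hypothetical, the function name
`crt_sign` is ours, and the fault is simulated by adding 1 to the half-signature modulo p. The modular
inverses use the three-argument `pow` with exponent -1, available in Python 3.8 and later.

```python
from math import gcd

def crt_sign(m, d, p, q, fault=False):
    """CRT-RSA signature; with fault=True the half-signature modulo p is corrupted."""
    sp = pow(m, d % (p - 1), p)
    sq = pow(m, d % (q - 1), q)
    if fault:
        sp = (sp + 1) % p                  # simulated fault in the computation of s_p
    # Garner recombination: s = sq + q * ((sp - sq) * q^{-1} mod p)
    h = (sp - sq) * pow(q, -1, p) % p
    return sq + q * h

def factor_from_fault(s_faulty, e, m, N):
    """gcd(s'^e - m mod N, N) reveals the prime whose half-signature was correct."""
    return gcd((pow(s_faulty, e, N) - m) % N, N)
```

With the hypothetical key p = 1009, q = 1013, e = 5 and message m = 123456, a correct signature
verifies, while a faulty one leaks the factor q = 1013.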

However, the message $m$ may have already gone through some padding, so that only part of its bits are known. The positions of the unknown bits might be known, or not, which corresponds to the two situations presented in the previous discussion. The same technique to speed up the enumeration can be applied, by substituting (6.7) with

$$p = \gcd\Bigl(N, \prod_{i=1}^{2^k} \bigl(s'^e - m^{(i)}\bigr) \bmod N\Bigr) \qquad (6.11)$$
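To make the product in (6.11) concrete, here is a minimal Python sketch for the known-positions case. It multiplies the $2^k$ differences naively, so it illustrates only the single final gcd, not the multipoint-evaluation speed-up. The primes, the exponent $e = 5$, the message, the unknown positions, and the simulated fault are all assumed values chosen for illustration, and we take $m^{(i)}$ to range over the $2^k$ candidate messages.

```python
from itertools import product
from math import gcd

p, q = 10**9 + 7, 10**9 + 9          # toy primes (assumed for illustration)
N, e = p * q, 5
d = pow(e, -1, (p - 1) * (q - 1))

m = 0x5A17C3E99B4D21                 # padded message, not fully known to the attacker
unknown_positions = [1, 4, 9, 15, 22, 30, 41, 50]   # k = 8 unknown bits
mask = sum(1 << pos for pos in unknown_positions)
m0 = m & ~mask                       # known bits, unknown bits set to 0

# Faulty CRT signature: s_p is correct, s_q is corrupted (simulated fault).
s_p = pow(m, d % (p - 1), p)
s_q_bad = (pow(m, d % (q - 1), q) + 1) % q
s_faulty = (s_q_bad + q * (((s_p - s_q_bad) * pow(q, -1, p)) % p)) % N

# Accumulate the product of (s'^e - m^(i)) over all 2^k candidates m^(i).
target = pow(s_faulty, e, N)
acc = 1
for bits in product((0, 1), repeat=len(unknown_positions)):
    candidate = m0 + sum(b << pos for b, pos in zip(bits, unknown_positions))
    acc = acc * (target - candidate) % N

recovered = gcd(N, acc)              # a single gcd for all candidates
print(recovered == p)
```

Only the factor belonging to the true message is divisible by $p$, so one gcd at the end replaces $2^k$ separate gcd computations. If some other candidate happened to make a factor divisible by $q$, the gcd would be $N$ and that case would need separate handling, which this sketch omits.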


6.5 Applications to Low-Exponent RSA 


In this section, we show that our previous algorithms for noisy factoring can be adapted to attack low-exponent RSA encryption. Consider an RSA ciphertext $c = m^e \bmod N$, where the public exponent $e$ is very small. Assume that one knows a noisy version $m'$ of the plaintext $m$, which differs from $m$ by at most $k$ bits, not necessarily consecutive, under either of the following two cases:


• If the $k$ positions of the noisy bits are known, we can recover $m$ by exhaustive search using at most $2^k$ polynomial-time operations: we stress that in this case, we assume that we do not know if each of the $k$ bits has been flipped, otherwise no search would be necessary.

• If instead, none of the positions is known, but we know that exactly $k$ bits have been modified, we can recover $m$ by exhaustive search using at most $\binom{n}{k}$ polynomial-time operations, where $n$ is the bit-length of $m$. If we only know an upper bound on the number of modified bits, we can simply repeat the attack with decreasing values of $k$.


This setting is usually called stereotyped RSA encryption [21]: there are well-known lattice attacks [21,59] against stereotyped RSA, but they require that the unknown bits are consecutive, or split across extremely few blocks.


6.5.1 Known Positions 


Assume that $m$ is a plaintext of $n$ bits, among which only $k$ bits are unknown, whose (arbitrary) positions are $b_1, \ldots, b_k$. Let $c = m^e \bmod N$ be the raw RSA ciphertext of $m$. If $e$ is small (say, constant), we can "square root" the time of exhaustive search, using multipoint polynomial evaluation.

Let $\ell = \lfloor (k - \log_2 e)/2 \rfloor$, and assume that $\ell > 0$.


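[Diagram: bit layout of $m$. The unknown bits $b_k, \ldots, b_{\ell+1}, b_\ell, \ldots, b_2, b_1$ (marked x, from most to least significant) are scattered among the known bits.]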


Let $m_0$ be derived from $m$ by keeping all the known $n - k$ bits, and setting all the $k$ unknown bits to 0.

[Diagram: bit layout of $m_0$, identical to that of $m$ except that every unknown bit is set to 0.]


For $1 \le i \le 2^{k-\ell}$, let the $x_i$'s enumerate all the integers obtained when $(b_{\ell+1}, \ldots, b_k)$ ranges over $\{0,1\}^{k-\ell}$.

[Diagram: bit layout of $x_i$, which is 0 everywhere except at the positions of the unknown bits $b_{\ell+1}, \ldots, b_k$.]

Similarly, for $1 \le j \le 2^{\ell}$, let the $y_j$'s enumerate all the integers obtained when $(b_1, \ldots, b_\ell)$ ranges over $\{0,1\}^{\ell}$.

[Diagram: bit layout of $y_j$, which is 0 everywhere except at the positions of the unknown bits $b_1, \ldots, b_\ell$.]


Thus, by construction, there is a unique pair $(i, j)$ such that:

$$c = (m_0 + x_i + y_j)^e \bmod N.$$

Now, we define the polynomial $f(X) = \prod_{i=1}^{2^{\ell}} \bigl((m_0 + y_i + X)^e - c\bigr) \bmod N$, which is of degree $e2^{\ell}$. If $x_t$ corresponds to the correct guess for the bits $b_{\ell+1}, \ldots, b_k$, then $f(x_t) = 0$. Hence, if we evaluate $f(X)$ at $x_1, \ldots, x_{2^{k-\ell}}$, we would be able to derive the $k - \ell$ higher bits $b_{\ell+1}, \ldots, b_k$, which gives rise to algorithm 42.


Algorithm 42 Decrypting Low-Exponent RSA With Known Positions

Input: An RSA modulus $N = pq$ and a ciphertext $c = m^e \bmod N$, where all the bits of $m$ are known, except at $k$ positions $b_1, \ldots, b_k$.

Output: The plaintext $m$.

1: Compute the polynomial $f(X) = \prod_{i=1}^{2^{\ell}} \bigl((m_0 + y_i + X)^e - c\bigr) \bmod N$ of degree $e2^{\ell}$, with coefficients modulo $N$, using algorithm 34.
2: Compute the evaluation of $f(X)$ at the points $x_1, \ldots, x_{2^{k-\ell}}$, using algorithm 35 sufficiently many times.
3: Find the unique $i$ such that $f(x_i) = 0$.
4: Deduce from $x_i$ the bits $b_{\ell+1}, \ldots, b_k$.
5: Find the remaining bits $b_1, \ldots, b_\ell$ by exhaustive search.


By definition of $\ell$, we have: $\frac{1}{2}\sqrt{2^k/e} < 2^{\ell} \le \sqrt{2^k/e}$ and $\sqrt{e2^k} \le 2^{k-\ell} < 2\sqrt{e2^k}$.

It follows that the overall complexity of algorithm 42 is $O(\sqrt{e2^k})$ polynomial-time operations, which is the "square root" of exhaustive search if $e$ is constant.
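The structure of algorithm 42 can be illustrated with a small Python sketch. It is not the algorithm as analysed above: the product tree (algorithm 34) and fast multipoint evaluation (algorithm 35) are replaced by schoolbook polynomial multiplication and Horner evaluation, so the sketch has the right logic but not the stated complexity. The function names, the toy modulus, the exponent $e = 5$ and the message are assumptions made for illustration.

```python
from itertools import product
from math import floor, log2

def poly_mul(a, b, N):
    """Schoolbook product mod N of coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % N
    return out

def poly_eval(f, x, N):
    """Horner evaluation of f at x modulo N."""
    acc = 0
    for coeff in reversed(f):
        acc = (acc * x + coeff) % N
    return acc

def patterns(positions):
    """All integers whose set bits lie within the given bit positions."""
    return [sum(b << pos for b, pos in zip(bits, positions))
            for bits in product((0, 1), repeat=len(positions))]

def recover_plaintext(N, e, c, m0, positions):
    """Recover m from c = m^e mod N, given m0 and the unknown bit positions."""
    positions = sorted(positions)
    k = len(positions)
    ell = min(k, max(1, floor((k - log2(e)) / 2)))
    ys = patterns(positions[:ell])   # low unknown bits b_1 .. b_ell
    xs = patterns(positions[ell:])   # high unknown bits b_{ell+1} .. b_k

    # Step 1: f(X) = prod_j ((m0 + y_j + X)^e - c) mod N, built naively.
    f = [1]
    for y in ys:
        factor = [1]
        for _ in range(e):
            factor = poly_mul(factor, [(m0 + y) % N, 1], N)
        factor[0] = (factor[0] - c) % N
        f = poly_mul(f, factor, N)

    # Steps 2 to 5: evaluate f at every x_i; on a root, search the low bits.
    for x in xs:
        if poly_eval(f, x, N) == 0:
            for y in ys:
                if pow(m0 + x + y, e, N) == c:
                    return m0 + x + y
    return None

# Toy instance (assumed values): N is a product of two small known primes.
N, e = (10**9 + 7) * (10**9 + 9), 5
m = 0x5A17C3E99B4D21
positions = [1, 4, 9, 15, 22, 30, 41, 50]
m0 = m & ~sum(1 << pos for pos in positions)
c = pow(m, e, N)
recovered = recover_plaintext(N, e, c, m0, positions)
print(recovered == m)
```

With $k = 8$ and $e = 5$ this gives $\ell = 2$, so $f$ has degree $e2^{\ell} = 20$ and is evaluated at $2^{k-\ell} = 64$ points. The inner check with `pow` also guards against a spurious root of $f$ modulo $N$.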

6.5.2 Unknown Positions 

In the previous section, we showed how to adapt our noisy factoring algorithm with known positions (algorithm 39) to the RSA case. Similarly, our noisy factoring algorithm with unknown positions (algorithm 41) can also be adapted. If the plaintext $m$ is known except for exactly $k$ unknown bit positions, then one can recover $m$ using on average $O(\ell\sqrt{ke})$ polynomial-time operations, where $\ell = \binom{n/2}{k/2}$ is roughly $\sqrt{\binom{n}{k}}$.

6.5.3 Variants 

Our technique was presented to decrypt stereotyped low-exponent RSA ciphertexts, but the same technique clearly applies to a slightly more general setting, where the RSA equation is replaced by an arbitrary univariate low-degree polynomial equation. More precisely, instead of $c = m^e \bmod N$, we may assume that $P(m) \equiv 0 \pmod{N}$ where $P$ is a univariate integer polynomial of degree $e$. This makes it possible to adapt various attacks [21] on low-exponent RSA, such as randomized padding across several blocks.